Preface


This is a textbook on classical mechanics at the intermediate level, but its 
main purpose is to serve as an introduction to a new mathematical language 
for physics called geometric algebra. Mechanics is most commonly formulated 
today in terms of the vector algebra developed by the American physicist 
J. Willard Gibbs, but for some applications of mechanics the algebra of 
complex numbers is more efficient than vector algebra, while in other applica- 
tions matrix algebra works better. Geometric algebra integrates all these 
algebraic systems into a coherent mathematical language which not only 
retains the advantages of each special algebra but possesses powerful new 
capabilities. 

This book covers the fairly standard material for a course on the mechanics 
of particles and rigid bodies. However, it will be seen that geometric algebra 
brings new insights into the treatment of nearly every topic and produces 
simplifications that move the subject quickly to advanced levels. That has 
made it possible in this book to carry the treatment of two major topics in 
mechanics well beyond the level of other textbooks. A few words are in order 
about the unique treatment of these two topics, namely, rotational dynamics 
and celestial mechanics. 

The spinor theory of rotations and rotational dynamics developed in this 
book cannot be formulated without geometric algebra, so a comparable 
treatment is not to be found in any other book at this time. The relation of the 
spinor theory to the matrix theory of rotations developed in conventional 
textbooks is completely worked out, so one can readily translate from one to 
the other. However, the spinor theory is so superior that the matrix theory is 
hardly needed except to translate from books that use it. In the first place, 
calculations with spinors are demonstrably more efficient than calculations 
with matrices. This has practical as well as theoretical importance. For 
example, the control of artificial satellites requires continual rotational com- 
putations that soon number in the millions. In the second place, spinors are 
essential in advanced quantum mechanics. So the utilization of spinors in the 
classical theory narrows the gap between the mathematical formulations of 
classical and quantum mechanics, making it possible for students to proceed 
more rapidly to advanced topics.

Celestial mechanics, along with its modern relative astromechanics, is
essential for understanding space flight and the dynamics of the solar system. 
Thus, it is essential knowledge for the informed physicist of the space age. 
Yet celestial mechanics is scarcely mentioned in the typical undergraduate 
physics curriculum. One reason for this neglect is the belief that the subject is 
too advanced, requiring a complex formulation in terms of Hamilton-Jacobi 
theory. However, this book uses geometric algebra to develop a new formula- 
tion of perturbation theory in celestial mechanics which is well within the 
reach of undergraduates. The major gravitational perturbations in the solar 
system are discussed to bring students up to date in space age mechanics. The 
new mathematical techniques developed in this book should be of interest to 
anyone concerned with the mechanics of space flight. 

The last chapter of this book presents a new analysis of the foundations of 
mechanics. The main objective is a formulation of mechanics which is com- 
plete, in the sense that the essential premises of the theory are explicitly 
formulated, and externally coherent, in the sense that it articulates smoothly 
with neighboring branches of physics, principally electromagnetic theory and 
special relativity. The entire analysis is carried out from the perspective of 
Modeling Theory, a general theory about the development and deployment of 
mathematical models proposed as a definite philosophy of science. 

To provide an introduction to geometric algebra suitable for the entire 
physics curriculum, the mathematics developed in this book exceeds what is 
strictly necessary for a mechanics course, including a substantial treatment of 
linear algebra and transformation groups with the techniques of geometric 
algebra. Since linear algebra and group theory are standard tools in modern 
physics, it is important for students to become familiar with them as soon as 
possible. There are good reasons for integrating instruction in mathematics 
and physics. It assures that the mathematical background will be sufficient for 
the needs of physics, and the physics provides nontrivial applications of the 
mathematics as it develops. But most important, it affords an opportunity to 
teach students that the design and development of an efficient mathematical 
language for representing physical facts and concepts is the business of 
theoretical physics. That is one of the objectives of this book. 

In a sequel to this book called New Foundations for Mathematical Physics 
(NFII), the geometric algebra developed here will be extended to a complete 
mathematical language for electrodynamics, relativity and quantum theory, 
in short, a unified language for physics. The chapter on relativity in NFII is a 
smooth continuation of the present book, so it could easily be included at the 
end of a two semester course on mechanics. 

The most complete available treatment of geometric algebra and calculus is 
given in Clifford Algebra to Geometric Calculus, published by Reidel in the 
same series as the present book. That book is written at an advanced 
mathematical level and contains no direct applications to physics, so it is not 
recommended for beginners. However, it should be useful to mathematicians
and theoretical physicists, and it will be much more accessible to readers who
have mastered the mathematics in the present book. 

The making of this book turned out to be much more difficult than I had 
anticipated, and could not have been completed without help from many 
sources. I am indebted to my NASA colleagues for educating me on the 
vicissitudes of celestial mechanics; in particular, Phil Roberts on orbital 
mechanics, Neal Hulkower on the three body problem, and, especially, Leon 
Blitzer for permission to draw freely on his lectures. I am indebted to Patrick 
Reany, Anthony Delugt and John Bergman for improving the accuracy of the 
text, and to Carmen Mendez and Denise Jackson for their skill and patience 
in typing a difficult manuscript. Most of all I am indebted to my wife Nancy 
for her unflagging support and meticulous care in preparing every one of the 
diagrams. 


DAVID HESTENES 


Chapter 1 


Origins of Geometric Algebra 


There is a tendency among physicists to take mathematics for granted, to 
regard the development of mathematics as the business of mathematicians. 
However, history shows that most mathematics of use in physics has origins in 
successful attacks on physical problems. The advance of physics has gone 
hand in hand with the development of a mathematical language to express 
and exploit the theory. Mathematics today is an immense and imposing 
subject, but there is no reason to suppose that the evolution of a mathemat- 
ical language for physics is complete. The task of improving the language of 
physics requires intimate knowledge of how the language is to be used and 
how it refers to the physical world, so it involves more than mathematics. It is
one of the fundamental tasks of theoretical physics. 

This chapter sketches some historical high points in the evolution of 
geometric algebra, the mathematical language developed and applied in this 
book. It is not supposed to be a balanced historical account. Rather, the aim 
is to identify explicit principles for constructing symbolic representations of 
geometrical relations. Then we can see how to design a compact and efficient 
geometrical language tailored to meet the needs of theoretical physics. 


1-1. Geometry as Physics 


Euclid’s systematic formulation of Greek geometry (in 300 BC) was the first 
comprehensive theory of the physical world. Earlier attempts to describe the 
physical world were hardly more than a jumble of facts and speculations. But 
Euclid showed that from a mere handful of simple assumptions about the 
nature of physical objects a great variety of remarkable relations can be 
deduced. So incisive were the insights of Greek geometry that it provided a
foundation for all subsequent advances in physics. Over the years it has been 
extended and reformulated but not changed in any fundamental way. 

The next comparable advance in theoretical physics was not consummated 
until the publication of Isaac Newton’s Principia in 1687. Newton was fully 
aware that geometry is an indispensable component of physics; asserting,

Fig. 1.1. Congruence of Line Segments.


Euclid’s axioms provide rules which enable one to compare any pair of line 
segments. Segments AB and CF can be compared as follows. 

First, a line parallel to AB can be drawn through C. And a line parallel to AC can be 
drawn through B. The two lines intersect at a point D. The line segment CD is 
congruent to AB. 

Second, a circle with center C can be drawn through D. It intersects the line CF at a 
point E. The segment CE is congruent to CD and, by the assumed transitivity of the 
relation, congruent to AB. 

Third, the point F is either inside, on, or outside the circle, in which cases we say
that the magnitude of CF is, respectively, less than, equal to, or greater than the
magnitude of AB.

The procedure just outlined can, of course, be more precisely characterized by a formal 
deductive argument. But the point to be made here is that this procedure can be regarded 
as a theoretical formulation of basic physical operations involved in measurement. 

If AB is regarded as the idealization of a standard stick called a “ruler”, the first step
above may be regarded as a description of the translation of the stick to the place CD 
without changing its magnitude. Then the second step idealizes the reorientation of the 
ruler to place it contiguous to an idealized body CF so that a comparison (third step) 
can be made. Further assumptions are needed to supply the ruler with a “graduated 
scale” and so assign a unique magnitude to CF. 


“... the description of right lines and circles, upon which geometry is founded, belongs to
mechanics. Geometry does not teach us to draw these lines, but requires them to be drawn...
To describe right lines and circles are problems, but not geometrical problems. The solution of
these problems is required from mechanics and by geometry the use of them, when so solved, is
shown; and it is the glory of geometry that from those few principles, brought from without, it is
able to produce so many things. Therefore geometry is founded in mechanical practice, and is
nothing but that part of universal mechanics which accurately proposes and demonstrates the art of
measuring ...” (italics added)


As Newton avers, geometry is the theory on which the practice of measure- 
ment is based. Geometrical figures can be regarded as idealizations of 
physical bodies. The theory of congruent figures is the central theme of 
geometry, and it provides a theoretical basis for measurement when it is 
regarded as an idealized description of the physical operations involved in 
classifying physical bodies according to size and shape (Figure 1.1). To put it

[Figure 1.2 label: parallel rays from the sun]


Fig. 1.2. Measurement of the Earth. 


The most accurate of the early measurements of the earth’s circumference was 
made by Eratosthenes (~200 BC). He observed that at noon on the day of the 
summer solstice the sun shone directly down a deep well at Syene. At the same 
time at Alexandria, taken to be due north and 5000 stadia (~500 miles) away, the 
sun cast a shadow indicating it was 1/50 of a circle from zenith. By the equality of 
corresponding angles in the diagram this gives 50 × 500 = 25 000 miles for the
circumference of the earth. 


another way, the theory of congruence specifies a set of rules to be used for 
classifying bodies. Apart from such rules the notions of size and shape have 
no meaning. 

Greek geometry was certainly not developed with the problem of measure- 
ment in mind. Indeed, even the idea of measurement could not be conceived 
until geometry had been created. But already in Euclid’s day the Greeks had 
carried out an impressive series of applications of geometry, especially to 
optics and astronomy (Figure 1.2), and this established a pattern to be 
followed in the subsequent development of trigonometry and the practical art 
of measurement. With these efforts the notion of an experimental science 
began to take shape. 
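
The arithmetic behind the estimate in Figure 1.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name circumference_from_arc is a hypothetical helper, and the inputs are the figures quoted in the caption (an arc of 5000 stadia, taken there as roughly 500 miles, subtending 1/50 of a circle).

```python
from fractions import Fraction


def circumference_from_arc(arc_length, fraction_of_circle):
    """Scale a measured arc up to the full circle (hypothetical helper).

    If the arc is the stated fraction of the whole circle, the
    circumference is the arc length divided by that fraction.
    """
    return arc_length / fraction_of_circle


# Values quoted in the caption of Figure 1.2.
shadow_fraction = Fraction(1, 50)
in_stadia = circumference_from_arc(5000, shadow_fraction)
in_miles = circumference_from_arc(500, shadow_fraction)

print(in_stadia, in_miles)  # 250000 25000
```

The whole measurement rests on the geometrical assumption that the sun's rays are parallel, so that the shadow angle at Alexandria equals the angle subtended at the center of the earth.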

Today, “to measure” means to assign a number. But it was not always so.
Euclid sharply distinguished “number” from “magnitude”. He associated the
notion of number strictly with the operation of counting, so he recognized 
only integers as numbers; even the notion of fractions as numbers had not yet 
been invented. For Euclid a magnitude was a line segment. He frequently 
represented a whole number n by a line segment which is n times as long as
some other line segment chosen to represent the number 1. But he knew that
the opposite procedure is impossible, namely, that it is impossible to distinguish
all line segments of different length by labeling them with numerals
representing the counting numbers. He was able to prove this by showing the
side and the diagonal of a square cannot both be whole multiples of a single
unit (Figure 1.3). 
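
The claim can be explored numerically, though a finite search is of course no substitute for the proof sketched in Figure 1.3. The following Python sketch looks for whole numbers m and n with m² = 2n², which is what a common unit for diagonal and side would require; the function name commensurable_pairs and the search bound are assumptions made for illustration.

```python
from math import isqrt


def commensurable_pairs(max_n):
    """Return all (m, n) with 1 <= n <= max_n and m*m == 2*n*n.

    Such a pair would make the diagonal an m-fold and the side an
    n-fold multiple of one unit. Exact integer arithmetic is used,
    so there is no rounding error.
    """
    pairs = []
    for n in range(1, max_n + 1):
        m = isqrt(2 * n * n)  # largest integer with m*m <= 2*n*n
        if m * m == 2 * n * n:
            pairs.append((m, n))
    return pairs


print(commensurable_pairs(10_000))  # []
```

The empty result agrees with the proof, which shows that no such pair exists however far the search is extended.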

The “one way” correspondence of counting numbers with magnitudes 
shows that the latter concept is the more general of the two. With admirable 
consistency, Euclid carefully distinguished between the two concepts. This is 
borne out by the fact that he proves many theorems twice, once for numbers
and once for magnitudes. This rigid distinction between number and magni- 
tude proved to be an impetus to progress in some directions, but an impedi- 
ment to progress in others. 

As is well known, even quite elementary problems lead to quadratic 
equations with solutions which are not integers or even rational numbers. 
Such problems have no solutions at all if only integers are recognized as 
numbers. The Hindus and the Arabs resolved this difficulty directly by 
generalizing their notion of number, but Euclid sidestepped it cleverly by 
reexpressing problems in arithmetic and algebra as problems in geometry. 
Then he solved for line segments instead of for numbers. Thus, he represented
the product x² as a square with a side of magnitude x. In fact, that is
why we use the name “x squared” today. The product xy was represented by
a rectangle and called the “rectangle” of the two sides. The term “x cubed”
used even today originates from the representation of x³ by a cube with side of
magnitude x. But there are no corresponding representations of x⁴ and higher
powers of x in Greek geometry, so the Greek correspondence between 
algebra and geometry broke down. This “breakdown” impeded mathematical 
progress from antiquity until the seventeenth century, and its import is 
seldom recognized even today. 

Commentators sometimes smugly dismiss Euclid’s practice of turning every 


Fig. 1.3. The diagonal of a square is incommensurable with 
its side. 


This can be proved by showing that its contrary leads to a
contradiction. Supposing, then, that a diagonal is an m-fold
multiple of some basic unit while a side is an n-fold multiple of
the same unit, the Pythagorean Theorem implies that m² =
2n². This equation shows that the integers m and n can be
assumed to have no common factor, and also that m² is even.
But if m² is even, then m is even, and m² has 4 as a factor.
Since n² = m²/2, n² and so n is also even. But the conclusion
that m and n are both even contradicts the assumption that
they do not have a common factor.

Euclid gave an equivalent proof using geometric methods. 
The proof shows that √2 is not a rational number, that is,
not expressible as a ratio of two integers. The Greeks could
represent √2 by a line segment, the diagonal of a unit
square. But they had no numeral to represent it.

algebra problem into an equivalent geometry problem as an inferior alternative
to modern algebraic methods. But we shall find good reasons to conclude
that, on the contrary, they have failed to grasp a subtlety of far-reaching 
significance in Euclid’s work. The real limitations on Greek mathematics 
were set by the failure of the Greeks to develop a simple symbolic language to 
express their profound ideas. 


1-2. Number and Magnitude 


The brilliant flowering of science and mathematics in ancient Greece was 
followed by a long period of scientific stagnation until an explosion of 
scientific knowledge in the seventeenth century gave birth to the modern 
world. To account for this explosion and its long delay after the impressive 
beginnings of science in Greece is one of the great problems of history. The 
“great man” theory implicit in so many textbooks would have us believe that 
the explosion resulted from the accidental birth of a cluster of geniuses like 
Kepler, Galileo and Newton. “Humanistic theories” attribute it to the social, 
political and intellectual climate of the Renaissance, stimulated by a rediscov- 
ery of the long lost culture of Greece. The invention and exploitation of the 
experimental method is a favorite explanation among philosophers and 
historians of science. No doubt all these factors are important, but the most 
critical factor is often overlooked. The advances we know as modern science 
were not possible until an adequate number system had been created to 
express the results of measurement, and until a simple algebraic language had 
been invented to express relations among these results. While social and 
political disorders undoubtedly contributed to the decline of Greek culture, 
deficiencies in the mathematical formalism of Greek science must have
been an increasingly powerful deterrent to every scientific advance and to the 
transmission of what had already been learned. The long hiatus between 
Greek and Renaissance science is better regarded as a period of incubation 
instead of stagnation. For in this period the decimal system of Arabic numerals
was invented and algebra slowly developed. It can hardly be an accident that 
an explosion of scientific knowledge was ignited just as a comprehensive 
algebraic system began to take shape in the sixteenth and seventeenth 
centuries. 

Though algebra was associated with geometry from its beginnings, René 
Descartes was the first to develop it systematically into a geometrical lan- 
guage. His first published work on the subject (in 1637) shows how clearly he 
had this objective in mind: 


“Any problem in geometry can easily be reduced to such terms that a knowledge of the lengths of
certain straight lines is sufficient for its construction. Just as arithmetic consists of only four or five 
operations, namely, addition, subtraction, multiplication, division and the extraction of roots, 
which may be considered a kind of division, so in geometry, to find required lines it is merely

Fig. 2.1a. Addition and Subtraction of Line Segments.


The geometrical theory of congruence (illustrated in Figure 1.1) gives a precise
mathematical expression of the idea that a line segment can be moved around 
without changing its length. Taking congruence for granted, two line segments 
labeled by their lengths a and b can be joined end to end to create a new line 
segment with length a + b. This is the geometrical equivalent of the addition of 
numbers a and b. The commutative law of addition is reflected in the fact that the 
line segments can be joined at either end with the same result. The geometrical 
analog of subtraction is obtained as illustrated by joining the line segments to 
create a new line segment of length b − a. Since the length of a line segment is a
positive number, negative numbers cannot be represented geometrically by label- 
ing line segments by length alone. 


necessary to add or subtract lines; or else, taking one line which I shall call unity in order to relate
it as closely as possible to numbers, and which can in general be chosen arbitrarily, and having
given two other lines, to find a fourth line which shall be to one of the given lines as the other is to
unity (which is the same as multiplication); or, again, to find a fourth line which is to one of the 
given lines as unity is to the other (which is equivalent to division); or, finally to find one, two, or 
several mean proportionals between unity and some other line (which is the same as extracting 
the square root, cube root, etc., of the given line). And I shall not hesitate to introduce these 
arithmetical terms into geometry, for the sake of greater clearness. . .” 


Descartes gave the Greek notion of magnitude a happy symbolic form by 
assuming that every line segment can be uniquely represented by a number. 
He was the first person to label line segments by letters representing their 
numerical lengths. As he demonstrated, the aptness of this procedure resides 
in the fact that the basic arithmetic operations such as addition and subtrac- 
tion can be supplied with exact analogs in geometrical operations on line 
segments (Figures 2.1a, 2.1b). One of his most significant innovations was to
discard the Greek idea of representing the “product” of two line segments by
a rectangle. In its stead he gave a rule for “multiplying” line segments which
yielded another line segment in exact correspondence with the rule for
multiplying numbers (Figure 2.2). This enabled him to avoid the apparent

−a + (−b) = −a − b        b + (−a) = b − a


Fig. 2.1b. Oriented Line Segments. 


Line segments can be assigned an orientation or sense as well as a length. There 
are exactly two different orientations called positive and negative; they are denoted,
respectively, by the signs + and − in arithmetic and represented by arrow heads in
the above diagram. If a line segment is labeled by a, then —a denotes a line 
segment of the same length but opposite orientation. 

As indicated in the diagram, oriented line segments are added by joining them 
end to end to produce a new line segment with a unique orientation. Subtraction is 
reduced to addition; subtraction by a is defined as addition of —a. 

Orientation is a geometric notion which has been given a symbolic rendering in 
algebra by the signs + and —. 

Descartes had not grasped the notion of orientation. This accounts for the fact 
that he was prone to error when a problem called for the geometric representation 
of a negative number. 


limitations of the Greek rule for “geometrical multiplication”. Descartes
could handle geometrical products of any order and he put this new ability to 
good use by showing how to use algebraic equations to describe geometric 
curves. This was the beginning of analytic geometry and a crucial step in the 
development of the mathematical language that makes modern physics poss- 
ible. Finally, Descartes made significant improvements in algebraic notations, 
putting algebra in a form close to the one we use today. 
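
Descartes' rule for multiplying line segments (Figure 2.2) can be checked with a little coordinate geometry. The Python sketch below is an illustration only: it places B at the origin, puts A and D on one ray at distances 1 and a, puts C on a second ray at distance b, and then intersects the line through D parallel to AC with the line BC. The coordinates, the function name multiply_segments and the default angle between the rays are all assumptions introduced here; Descartes, of course, worked with ruler and compass rather than coordinates.

```python
import math


def multiply_segments(a, b, angle=math.pi / 3):
    """Length of BE in the construction of Figure 2.2 (illustrative sketch).

    B is the origin, A and D lie on the x-axis with |BA| = 1 and |BD| = a,
    and C lies on a ray at `angle` to the x-axis with |BC| = b.
    """
    A = (1.0, 0.0)
    D = (a, 0.0)
    C = (b * math.cos(angle), b * math.sin(angle))

    # Direction of the line AC; the line through D is parallel to it.
    ux, uy = C[0] - A[0], C[1] - A[1]

    # Find s and t with  s*C = D + t*u,  i.e.  s*C - t*u = D,
    # solved for s by Cramer's rule.
    det = C[0] * (-uy) - (-ux) * C[1]
    s = (D[0] * (-uy) - (-ux) * D[1]) / det

    E = (s * C[0], s * C[1])
    return math.hypot(E[0], E[1])


print(multiply_segments(2.0, 3.0))  # approximately 6.0
```

The result does not depend on the angle chosen between the two rays, which reflects the fact that the construction rests only on the proportionality of sides in similar triangles: BA is to BD as BC is to BE.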

It has been said that the things a man takes for granted are a measure of his
debt to his culture. The assumption of a complete correspondence between
numbers and line segments was the foundation of Descartes’ union of geometry
and algebra. A careful Greek logician like Eudoxus would have
demanded some justification for such a far-reaching assumption. Yet, Descartes’
contemporaries accepted it without so much as a raised eyebrow. It did not 
seem revolutionary to them, because they were accustomed to it. In fact, 




Fig. 2.2. Multiplication of Line Seg- 
ments. 


Given line segments BD of length a 
and BC of length b, Descartes con- 
structed a new line segment BE of 
length ab by the following procedure: 
First lay off the segment AB of unit 
length. Then draw the line through D 
parallel to AC; it intersects the line 
BC at the point E. 

To make the correspondence be- 
tween algebra and geometry more di- 
rect, Descartes introduced the 
notation 


a = BD, b = BC, AB = 1, ab = BE,

(except he used the symbol ∝ instead
of =).

The fact that BE has length ab jus- 
tifies calling the geometrical construc- 
tion of BE “multiplication”. A geo- 
metrical proof of this fact had already 
been given by Euclid in his account of 
the Greek “theory of proportions”.
Descartes’ rule gave the Greek theory 
at last a proper symbolic expression in 
algebraic language.

Fig. 2.3. Extraction of Roots.


Given line segments a and b, construct
a half circle with diameter a + b.
Construct the half chord x intersecting
the diameter a distance a
from the circle. The Pythagorean theorem
applied to the triangle in the
above diagram gives x = √(ab), or x =
√a if b = 1. So the construction of x
is a geometric analog of the arithmetic
computation of a square root.

This construction appears in Book 
II of Euclid’s Elements, from whence, 
no doubt, Descartes obtained it. But 
Descartes expresses it in the language 
of algebra. 


algebra and arithmetic had never been free of some admixture of geometry. 
But a union could not be consummated until the notion of number and the 
symbolism of algebra had been developed to a degree commensurate with 
Greek geometry. That state of affairs had just been reached when Descartes 
arrived on the historical scene. 
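
The root construction of Figure 2.3 can be verified in the same spirit. In the Python sketch below, the half circle has diameter a + b, and the half chord is erected perpendicular to the diameter at distance a from one end; the Pythagorean theorem then gives its length. The function name half_chord and the use of a radius and an offset from the center are assumptions made for this illustration.

```python
import math


def half_chord(a, b):
    """Length x of the half chord in Figure 2.3 (illustrative sketch).

    The half circle has diameter a + b. The foot of the half chord lies
    at distance a from one end of the diameter, hence at distance
    |a - r| from the center, where r is the radius.
    """
    r = (a + b) / 2.0          # radius of the half circle
    d = a - r                  # signed offset of the foot from the center
    # Right triangle: radius is the hypotenuse, d and x are the legs.
    return math.sqrt(r * r - d * d)


print(half_chord(4.0, 9.0))   # 6.0, the square root of 4 * 9
print(half_chord(2.0, 1.0))   # approximately 1.41421, the square root of 2
```

Algebraically, r² − (a − r)² = 2ar − a² = a(a + b) − a² = ab, so the half chord has length √(ab), in agreement with the caption.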

Descartes stated explicitly what everyone had taken for granted. If Descartes 
had not done it someone else would have. Indeed, Fermat independently 
achieved quite similar results. But Descartes penetrated closer to the heart of 
the matter. His explicit union of the notion of number with the Greek
geometric notion of magnitude sparked an intellectual explosion unequalled
in all history. 

Descartes was not in the habit of acknowledging his debt to others, but in a
letter to Mersenne in 1637 he writes, 


“As to the suggestion that what I have written could easily have been gotten from Vieta, the very 
fact that my treatise is hard to understand is due to my attempt to put nothing in it that I believed 
to be known either by him or by any one else. ... I begin the rules of my algebra with what Vieta
wrote at the very end of his book, . . . Thus, I begin where he left off.” 


The contribution of Vieta has been too frequently undervalued. He is the one 
who explicitly introduced the idea of using letters to represent constants as 
well as unknowns in algebraic equations. This act lifted algebra out of its 
infancy by separating the study of special properties of individual numbers 
from the abstract study of the general properties of all numbers. It revealed the
dependence of the number concept on the nature of algebraic operations. 
Vieta used letters to denote numbers, and Descartes followed him by using 
letters to denote line segments. Vieta began the abstract study of rules for 
manipulating numbers, and Descartes pointed out the existence of similar 
rules for manipulating line segments. Descartes gives some improvements on 
the symbolism and algebraic technique of Vieta, but it is hard to say how 
much of this comes unacknowledged from the work of others. Before Vieta’s 
innovations, the union of geometry and algebra could not have been effected. 

The correspondence between numbers and line segments presumed by 
Descartes can be most simply expressed as the idea that numbers can be put 
into one to one correspondence with the points on a geometrical line (Figure 
2.4). This idea seems to be nearly as old as the idea of a geometrical line itself. 
The Greeks may have believed it at first, but they firmly rejected it when 
incommensurables were discovered (Figure 1.3). Yet Descartes and his 
contemporaries evidently regarded it as obvious. Such a significant change in 
attitude must have an interesting history! Of course, such a change was 
possible only because the notion of number underwent a profound evolution. 

Diophantus (250 AD), the last of the great Greek mathematicians, was
probably the first to regard fractions as numbers. But the development most 
pertinent to the present discussion was the invention of algebraic numbers. 
This came about by presuming the existence of solutions to algebraic equations
and devising symbols to represent them. Thus the symbol √2 was
invented to designate a solution of the equation x² = 2. Once the symbol √2
had been invented, it was hard to deny the reality of the number it names, and 
this number takes on a more concrete appearance when identified as the 
diagonal length of a unit square. In this way the incommensurables of the 
Greeks received number names at last, and with no reason to the contrary, it 
must have seemed natural to assume that all points on a line can be named by 
numbers. So it seemed to Descartes. Perhaps it is a good thing that there was 
no latter-day Eudoxus to dampen Descartes’ ardor by proving that it is

Fig. 2.4. The number line.


The points on a line can be put into one to one correspondence with numbers by 
labeling (naming) them with decimal numerals. More specifically, every point can 
be uniquely labeled by an (infinite) decimal numeral with the form 

± a₁a₂…aₘ . b₁b₂…bₙ…


Here, of course the a’s and b’s can have only integer values from zero to nine, and 
m and n represent natural numbers. 

The “real number line” can be defined “arithmetically” simply as the set of all
decimal numerals. This definition may seem to be devoid of any geometrical 
content. However, the familiar arithmetic rules for adding and multiplying num- 
bers in decimal form correspond exactly to the rules needed to define a geometrical 
construction. Moreover, a decimal numeral can be interpreted as a set of instruc- 
tions for the unique determination of a geometrical point (or, equivalently, for the 
construction of a line segment) by elementary geometrical operations. Only a few 
simple conventions are required: 

(1) On the given line a point must be chosen and labeled zero. 

(2) The two orientations of the line must be labeled positive and negative. 

(3) A convenient line segment must be chosen as a unit. 

Then the “integer part” of the numeral can be interpreted as an instruction to begin
at zero and lay off a₁a₂…aₘ units in the direction designated by the sign of the
numeral. The “decimal part” of the numeral b₁b₂…bₙ… then “says” to divide
the next consecutive unit segment into ten equal parts and “move forward” b₁ of
these units, etc. If the decimal is infinite, an infinite sequence of geometrical
operations will be needed to determine the point. 

It should be noted that geometry requires that lines, points and units be given, 
but the nature of these entities is actually determined only by the geometrical 
relations they enter into. Geometry only specifies rules. Arithmetic may be re- 
garded as a formulation of geometrical rules without reference to undefined 
entities. This does not mean that undefined entities can be dispensed with. They 
are essential when arithmetic and geometry are applied to the physical world; then, 
a unit is typically given in the form of a physical object, and a line may be given by a 
ray of light; then a number typically specifies operations which have been or are to 
be carried out on physical objects. In modern physics the relations of mathematical 
entities to physical objects and operations are extremely intricate. 


impossible to name every point on a line by an algebraic number. Descartes 
did not even suspect that the circumference of a unit circle is not an algebraic 
number, but then, that was not proved until 1882. 

Deficiencies in the notion of number were not felt until the invention of
calculus called for a clear idea of the “infinitely small”. A clear notion of
“infinity” and with it a clear notion of the “continuum of real numbers” was
not achieved until the latter part of the nineteenth century, when the real
number system was “arithmeticized” by Weierstrass, Cantor and Dedekind.
“Arithmeticize” means define the real numbers in terms of the natural
numbers and their arithmetic, without appeal to any geometric intuition of
“the continuum”. Some say that this development separated the notion of
number from geometry. Rather the opposite is true. It consummated the union
of number and geometry by establishing at last that the real numbers can be
put into one-to-one correspondence with the points on a geometrical line. The
arithmetical definition of the “real numbers” gave a precise symbolic
expression to the intuitive notion of a continuous line (Figure 2.4).

Descartes began the explicit cultivation of algebra as a symbolic system for 
representing geometric notions. The idea of number has accordingly been 
generalized to make this possible. But the evolution of the number concept 
does not end with the invention of the real number system, because there is 
more to geometry than the linear continuum. In particular, the notions of 
direction and dimension cry out for a proper symbolic expression. The cry has 
been heard and answered. 


1-3. Directed Numbers 


After Descartes, the use of algebra as a geometric language expanded with 
ever mounting speed. So rapidly did success follow success in mathematics 
and in physics, so great was the algebraic skill that developed, that for some time
no one noticed the serious limitations of this mathematical language. 

Descartes expressed the geometry of his day in the algebra of his day. It did 
not occur to him that algebra could be modified to achieve a fuller symbolic 
expression of geometry. The algebra of Descartes could be used to classify 
line segments by length. But there is more to a line segment than length; it has 
direction as well. Yet the fundamental geometric notion of direction finds no 
expression in ordinary algebra. Descartes and his followers made up for this 
deficiency by augmenting algebra with the ever ready natural language. 
Expressions such as “the x-direction” and “the y-direction” are widely used
even today. They are not part of algebra, yet ordinary algebra cannot be 
applied to geometry without them. 

Mathematics has steadily progressed by fashioning special symbolic systems 
to express ideas originally expressed in the natural language. The first math- 
ematical system, Greek geometry, was formulated entirely in the natural 
language. How else was mathematics to start? But, to use the words of 
Descartes, algebra makes it possible to go “as far beyond the treatment of
ordinary geometry, as the rhetoric of Cicero is beyond the a, b, c of
children”. How much more can be expected from further refinements of the
geometrical language? 

The generalization of number to incorporate the geometrical notion of 
direction as well as magnitude was not carried out until some two hundred 
years after Descartes. Though several people might be credited with
conceiving the idea of “directed number”, Hermann Grassmann, in his book of
1844, developed the idea with precision and completeness that far surpassed
the work of anyone else at the time. Grassmann discovered a rule for relating 
line segments to numbers that differed slightly from the rule adopted by 
Descartes, and this led to a more general notion of number. 

Before formulating the notion of a directed number, it is advantageous to
substitute the short and suggestive word “scalar” for the more common but
clumsy expression “real number”, and to recall the key idea of Descartes’
approach. Descartes united algebra and geometry by corresponding the
arithmetic of scalars with a kind of arithmetic of line segments. More
specifically, if two line segments are congruent, that is, if one segment can be
obtained from the other by a translation and rotation, then Descartes would
designate them both by the same positive scalar. Conversely, every positive
scalar designates a “line segment” which possesses neither a place nor a
direction, all congruent line segments being regarded as one and the same.
Or, to put it in modern mathematical terminology, every positive scalar
designates an equivalence class of congruent line segments. This is the rule
used by Descartes to relate numbers to line segments.

Alternatively, Grassmann chose to regard two line segments as “equivalent”
if and only if one can be obtained from the other by a translation; only
then would he designate them by the same symbol. If a rotation was required 
to obtain one line segment from another, he regarded the line segments as 
“possessing different directions” and so designated them by different sym- 
bols. These conventions lead to the idea of a “directed line segment” or 
vector as a line segment which can be moved freely from place to place 
without changing either its magnitude or its direction. To achieve a simple 
symbolic expression of this idea and yet distinguish vectors from scalars, 
vectors will be represented by letters in bold face type. If two line segments, 
designated by vectors a and b respectively, have the same magnitude and 
direction, then the vectors are said to be equal, and, as in scalar algebra, one 
writes 

a=b. (3.1) 

Of course the use of vectors to express the geometrical fact that line 
segments may differ in direction does not obviate the value of classifying line 
segments by length. But a simple formulation of the relation between scalars 
and vectors is called for. It can be achieved by observing that to every vector a 
there corresponds a unique positive scalar, here denoted by | a | and called 
the “magnitude” or the “length” of a. This follows from the correspondence 
between scalars and line segments which has already been discussed.
Suppose, now, that a vector b has the same direction as a, but |b| = λ|a|,
where λ is a positive scalar. This can be expressed simply by writing


b = λa. (3.2)


Fig. 3.1. Vectors and Arrows. (Figure label: |a| = |b| = |c|.)


A vector a can be pictured as a “directed line segment” or arrow. The length of
the arrow corresponds to the magnitude of the vector. Arrows representing vectors 
with the same direction lie on parallel lines. Arrows with the same length and 
direction can be regarded as different representations of one and the same vector 
no matter where they are located; so they can be labeled by one and the same 
vector symbol. 

Arrows that do not lie on parallel lines must be labeled by different vector 
symbols, even if they have the same length. 


But this can be interpreted as an equation defining the multiplication of a
vector by a scalar. Thus, multiplication by a positive scalar changes the
magnitude of a vector but not its direction (Figure 3.2). This operation is
commonly called a dilation. If λ > 1, it is an expansion, since then |b| > |a|.
But if λ < 1, it is a contraction, since then |b| < |a|. Descartes’ geometrical
construction for “multiplying” two line segments (Figure 2.2) is a dilation
of one line segment by the magnitude of the other.


Fig. 3.2. Illustrating dilation.

Multiplication by a positive scalar changes the length but not the direction of a vector.


Equation (3.2) allows one to write


a = |a| â, where |â| = 1. (3.3)


This expresses the vector a as the product of its magnitude |a| with a “unit
vector” â. The “unit” â uniquely specifies the direction of a, so Equation (3.3)
can be regarded as a decomposition of a into magnitude and direction.
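
Dilation (3.2) and the decomposition (3.3) can be tried out numerically. The sketch below assumes that a vector is represented by its components in a rectangular coordinate frame, a representation the text has not introduced at this point; the function names magnitude, dilate and decompose are likewise chosen here only for illustration.

```python
import math

def magnitude(a):
    """Length |a| of a vector given by rectangular components (assumed)."""
    return math.sqrt(sum(x * x for x in a))

def dilate(lam, a):
    """Scalar multiple lam * a, in the sense of Equation (3.2)."""
    return tuple(lam * x for x in a)

def decompose(a):
    """Return (|a|, a_hat) with a = |a| a_hat, as in Equation (3.3)."""
    length = magnitude(a)
    if length == 0:
        # The zero vector has no definite direction to report.
        raise ValueError("the zero vector has no direction")
    return length, tuple(x / length for x in a)

a = (3.0, 4.0)
length, a_hat = decompose(a)   # magnitude and unit vector of a
b = dilate(2.5, a)             # an expansion, since 2.5 > 1
print(length, a_hat, magnitude(b))
```

Recombining the two parts with dilate(length, a_hat) should reproduce a up to rounding, which is one way to see that nothing is lost in the decomposition.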

If Equation (3.2) is supposed to hold, then multiplication of a vector a by 
zero results in a vector with zero magnitude. Express this by writing 


(0)a = 0. (3.4) 


Since the direction associated with a line segment of zero length seems to be 
of no consequence, it is natural to assume that the zero vector on the right 
side of (3.4) is a unique number no matter what the direction of a. Moreover, 
it will be seen later that there is good reason to regard the zero vector as one 
and the same number as the zero scalar. So the zero on the right side of (3.4) 
is not written in bold face type. 

Grassmann may have been the first person to clearly understand that the 
significance of a number lies not in itself but solely in its relation to other 
numbers. The notion of number resides in the rules for combining two 
numbers to get a third. Grassmann looked for rules for combining vectors 
which would fully describe the geometrical properties of directed line seg- 
ments. He noticed that two directed line segments connected end to end 
determine a third, which may be regarded as their sum. This “geometrical 
sum” of directed line segments can be simply represented by an equation for 
corresponding vectors a, b, s: 


a + b = s. (3.5)


Fig. 3.3. Addition of Arrows. (Panel labels: s = a + b; a + b = b + a; (a + b) + c = a + (b + c).)


Two arrows, labeled respectively by vectors a and b above can be “added” geometrically
by joining the tip of one with the tail of the other and drawing in the arrow
labeled s that connects the remaining tip and tail. The properties of vector addition are 
determined by assuming that addition of vectors a and b to get s corresponds exactly to 
this geometrical construction. 

Since the same arrow s is obtained whether the tip of a is joined to the tail of b or the 
tip of b is joined to the tail of a, vector addition must be commutative. 

Since, as the figure shows, the result of adding arrows a + b to c is the same as the 
result of adding a to b + c, vector addition must be associative. 
This procedure is like the one used by Descartes to relate “geometrical
addition of line segments” to addition of scalars, except that the definition of
“geometrically equivalent” line segments is different. 

The rules for adding vectors are determined by the assumed correspon- 
dence with directed line segments. As shown in Figure 3.3, vector addition, 
like scalar addition, must obey the commutative rule, 


a + b = b + a, (3.6)

and the associative rule,

(a + b) + c = a + (b + c). (3.7)


As in scalar algebra, the number zero plays a special role in vector addition. 
Thus, 


a+0=a. (3.8) 


Moreover, to every vector a there corresponds one and only one vector b which
satisfies the equation 


a+b=0. (3.9) 


This unique vector is called the negative of a and denoted by −a (Figure 3.4).


Fig. 3.4. Negative vectors.

An arrow representing −a differs from one
representing a only in having its tip point in the
opposite direction. We say that a and −a have
opposite orientation.


Fig. 3.5. Comparing addition and subtraction of arrows and vectors.


The existence of negatives makes it possible to define subtraction as 
addition of a negative. Thus 


c − a = c + (−a). (3.10)


Subtraction and addition are compared in Figure 3.5. 
The existence of negatives also makes it possible to define multiplication by
the scalar −1 by the equation


(−1)a = −a. (3.11)


This equation justifies interpreting −1 as a representation of the operation of
reversing direction (i.e. orientation, as explained in Figure 2.1b). Then the
equation (−1)² = 1 simply expresses the obvious geometrical fact that by
reversing direction twice one reproduces the original direction. In this way
the concept of directed numbers leads to an operational interpretation of
negative numbers.

Now that the geometrical meaning of multiplication by minus one is 
understood, it is obvious that Equation (3.2) is meaningful even if λ is a
negative scalar. Vectors which are scalar multiples of one another are said to 
be codirectional or collinear. 
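
The rules (3.6) through (3.11) can be checked on a concrete case. The following is a minimal sketch, assuming vectors are represented by tuples of integer components so that every comparison is exact; the names add, negative and subtract are hypothetical helpers. The tuple (0, 0) stands in for the zero vector only because of this component representation, whereas the text regards the zero vector and the zero scalar as one and the same number.

```python
def add(a, b):
    """Componentwise sum, standing in for tip-to-tail addition of arrows."""
    return tuple(x + y for x, y in zip(a, b))

def negative(a):
    """The vector -a: same magnitude as a, opposite orientation."""
    return tuple(-x for x in a)

def subtract(c, a):
    """c - a defined as c + (-a), Equation (3.10)."""
    return add(c, negative(a))

a, b, c = (1, 2), (3, -1), (-2, 5)
zero = (0, 0)

checks = {
    "commutative rule (3.6)": add(a, b) == add(b, a),
    "associative rule (3.7)": add(add(a, b), c) == add(a, add(b, c)),
    "zero (3.8)": add(a, zero) == a,
    "negative (3.9)": add(a, negative(a)) == zero,
    "reversing twice": negative(negative(a)) == a,
}
print(checks)
```

A check on one triple of vectors is of course no proof; in the text the rules follow from the correspondence with directed line segments.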


1-4. The Inner Product 


A great many significant geometrical theorems can be simply expressed and 
proved with the algebraic rules for vector addition and scalar multiplication 
which have just been set down. However, the algebraic system as it stands 
cannot be regarded as a complete symbolic expression of the geometric 
notions of magnitude and direction, because it fails to fully indicate the 
difference between scalars and vectors. This difference is certainly not re- 
flected in the rules for addition, which are the same for both scalars and 
vectors. In fact, the distinction between scalars and vectors still resides only in 
their geometric interpretations, that is, in the different rules used to corre- 
spond them with line segments. 

The opportunity to give the notion of direction a full algebraic expression 
arises when the natural question of how to multiply vectors is entertained. 
Descartes gave a rule for “multiplying” line segments, but his rule does not
depend on the direction of the line segments, and it already has an algebraic 
expression as a dilation. Yet the general approach of Descartes can be 
followed to a different end. One can look for a significant geometrical 
construction based on two line segments that does depend on direction; then, 
by correspondence, use this construction to define the product of two vectors. 

One need not look far, for one of the most familiar constructions of 
ordinary geometry is readily seen to meet the desired specifications, namely, 
the perpendicular projection of one line segment on another (Figure 4.1). A 
study of this construction reveals that, though it depends on the relative 
directions of the line segments to be “multiplied”, the result depends on the
magnitude of only one of them. This result can be multiplied by the magni- 
tude of the other to get a more symmetrical relation. In this way one is led to 
the following rule for multiplying vectors: 
Fig. 4.1. Perpendicular projection.


The perpendicular projection of OA on the line OB is the line segment OC. The
adjective “perpendicular” expresses the stipulation that the line AC be perpendicular
to OC. If θ is the radian measure of the angle AOB and if a = OA is the length of OA,
then, one can write


OC = a cos θ.

Likewise, if b = OB, then

OD = b cos θ.


In the second diagram, to distinguish the projection of OA′ on OB from the projection
of OA on the same line where OA′ = a = OA, it is convenient to regard OC′ and OC
as the line segments with the same magnitude but opposite orientation. This can be
expressed by writing


OC′ = a cos(π − θ) = −a cos θ = −OC.


This scalar quantity clearly depends on the orientation of OB, because OB determines
the line from which the angle is measured.

By taking orientation into account, we go slightly beyond the Greek idea of 
perpendicular projection. 


Define the “inner product” of two directed line segments, denoted by 
vectors a and b respectively, to be the oriented line segment obtained by 
dilating the projection of a on b by the magnitude of b. The magnitude and 
the orientation of the resulting line segment is a scalar; denote this scalar by 
a·b and call it the inner product of vectors a and b.

This definition of a·b implies the following relation to the angle θ between a
and b:


a·b = |a| |b| cos θ. (4.1)


This expression is commonly taken as the definition of a·b, but that calls for
an independent definition of cos θ, which would be out of place here.
It should be noted that the geometrical construction on which the definition
of a·b is based actually gives a line segment directed along the same line as a
or b; the magnitude and relative orientation of this line segment were used in
the definition of a·b, but its direction was not. There is good reason for this. It
is necessary if the algebraic rule for multiplication is to depend only on the
relative directions of a and b. Thus, as defined, the numerical value of a·b is
unaffected by any change in the directions of a and b which leaves the angle
between a and b fixed. Moreover, a·b has the important symmetry property


a·b = b·a. (4.2)


This expresses the fact that the projection of a on b dilated by |b| gives the
same result as the projection of b on a dilated by |a| (Figure 4.2).

The inner product has, besides (4.2), several basic algebraic properties
which can easily be deduced from its definition by correspondence with
perpendicular projection. Its relation to scalar multiplication of vectors is
expressed by the rule


(λa)·b = λ(a·b) = a·(λb). (4.3)


Here λ can be any scalar, whether positive, negative or zero. Its relation to vector
addition is expressed by the distributive rule


a·(b + c) = a·b + a·c (4.4)

(Figure 4.3). The magnitude of a vector is related to the inner product by

a·a = |a|² ≥ 0. (4.5)


Of course, a·a = 0 if and only if a = 0.
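
The properties (4.1) through (4.5) can be checked numerically. The sketch below rests on an assumption not made in the text: vectors are given by components in a rectangular (orthonormal) frame, so that the inner product reduces to a sum of products of components. The helper names dot, add, scale, magnitude and angle are chosen here for illustration only.

```python
import math

def dot(a, b):
    """Inner product, assuming components in an orthonormal frame."""
    return sum(x * y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(lam, a):
    return tuple(lam * x for x in a)

def magnitude(a):
    """|a| obtained from a.a = |a|^2, Equation (4.5)."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle between a and b, found by solving Equation (4.1) for theta."""
    return math.acos(dot(a, b) / (magnitude(a) * magnitude(b)))

a, b, c, lam = (3, 4), (1, 1), (2, -5), -3

symmetric = dot(a, b) == dot(b, a)                                   # (4.2)
homogeneous = (dot(scale(lam, a), b) == lam * dot(a, b)
               == dot(a, scale(lam, b)))                             # (4.3)
distributive = dot(a, add(b, c)) == dot(a, b) + dot(a, c)            # (4.4)
theta = angle(a, b)
print(symmetric, homogeneous, distributive, theta)
```

Integer components keep the three algebraic checks exact; only the angle involves floating-point arithmetic.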

The inner product greatly increases the usefulness of vectors, for it can be 
used to compute angles and the lengths of line segments. Important theorems 
of geometry and trigonometry can be proved easily by the methods of “vector
algebra”, so easily, in fact, that it is hardly necessary to single them out by
calling them theorems. Results which men once went to great pains to prove 
have been worked into the algebraic rules where they can be exploited 
routinely. For example, everyone knows that a great many theorems about 
triangles are proved in trigonometry and geometry. But, such theorems seem 
superfluous when it is realized that a triangle can be completely characterized 
by the simple vector equation 


a + b = c. (4.6)


From this equation various properties of a triangle can be derived by simple 
steps. For instance, by “squaring” and using the distributive rule (4.4) one 
gets an equation relating sides and angles: 


c·c = (a + b)·(a + b)
    = a·(a + b) + b·(a + b)
    = a·a + b·b + a·b + b·a.
Fig. 4.2. Symmetry of the Scalar Product.


The perpendicular projections of a on b, and b on a give respectively

a·b̂ = |a| cos θ, and b·â = |b| cos θ,

which after dilation results in the symmetrical form

a·b = |b| a·b̂ = |a| b·â = b·a.


Fig. 4.3. Distributive rule.


The inner product is distributive with respect to vector addition. Projection
gives

â·(b + c) = â·b + â·c.

Multiply this by |a|, and use the distributive rule for scalars. Finally use
the relation (4.3) and |a| â = a to get the general distributive rule (4.4).


Or, using (4.2) and (4.5), 


|c|² = |a|² + |b|² + 2a·b. (4.7)


This equation can be reexpressed in terms of scalar labels still commonly used 
in trigonometry. Figure 4.4 indicates the relations 


a = |a|, b = |b|, c = |c|, a·b = −ab cos C.

So (4.7) can be written in the form


c² = a² + b² − 2ab cos C. (4.8)


This formula is called the “law of cosines” in trigonometry. If C is a right
angle, then cos C = 0, and Equation (4.8) reduces to the Pythagorean
Theorem.


Fig. 4.4. Scalar and vector labels for a triangle.
By rewriting (4.6) in the form a = c − b and squaring one gets a formula
similar to (4.8) involving the angle A. Similarly, an equation involving angle B
can be obtained. In this way one gets three equations relating the scalars a, b, c, 
A, B, C. These equations show that given the magnitude of three sides, or of two 
sides and the angle between, the remaining three scalars can be computed. This 
result may be recognized as encompassing several theorems of geometry. The 
point to be made here is that these results of geometry and trigonometry need 
not be remembered as theorems, since they can be obtained so easily by the 
“algebra” of scalars and vectors. 
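
As a concrete illustration, the three equations can be used to compute the sides and angles of a triangle from the two side vectors a and b. The sketch below assumes components in a rectangular frame, and the function name solve_triangle is hypothetical. The angle labels follow the usual convention, taken here to match Figure 4.4, that A, B, C are the interior angles opposite the sides of length a, b, c.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def magnitude(a):
    return math.sqrt(dot(a, a))

def solve_triangle(a, b):
    """Side lengths and interior angles of the triangle a + b = c."""
    c = tuple(x + y for x, y in zip(a, b))
    la, lb, lc = magnitude(a), magnitude(b), magnitude(c)
    C = math.acos(-dot(a, b) / (la * lb))  # from a.b = -ab cos C
    A = math.acos(dot(b, c) / (lb * lc))   # from squaring a = c - b
    B = math.acos(dot(a, c) / (la * lc))   # from squaring b = c - a
    return (la, lb, lc), (A, B, C)

(la, lb, lc), (A, B, C) = solve_triangle((4, 0), (1, 3))

# Law of cosines (4.8) and the angle sum of a plane triangle.
law_of_cosines_gap = lc**2 - (la**2 + lb**2 - 2 * la * lb * math.cos(C))
angle_sum = A + B + C
print(law_of_cosines_gap, angle_sum)
```

The first printed number should vanish up to rounding, and the second should agree with π, which serves as an independent check on the three angles.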

Trigonometry is founded on the Greek theories of proportion and perpendicular
projection. But the principal ideas of trigonometry did not find their
simplest symbolic expression until the invention of vectors and the inner
product by Grassmann. Grassmann originally defined the inner product just 
as we did by correspondence with a perpendicular projection. But he also 
realized that once the basic algebraic properties have been determined by 
correspondence, no further reference to the idea of projection is necessary. 
Thus, the “inner product” can be fully defined abstractly as a rule relating 
scalars to vectors which has the properties specified by Equations (4.2) 
through (4.5). 

With the abstract definition of a·b the intuitive notion of relative direction
at last receives a precise symbolic formulation. The notion of number is 
thereby nearly developed to the point where the principles and theorems of 
geometry can be completely expressed by algebraic equations without the 
need to use natural language. For example, the statement “lines OA and OB 
are perpendicular” can now be better expressed by the equation 


a·b = 0. (4.9)
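
As a small illustration of how the perpendicularity condition (4.9) is used (an example added here, not taken from the text), consider a parallelogram with adjacent sides a and b, so that its diagonals are a + b and a − b. Using the distributive rule (4.4), rule (4.3) with the scalar −1, the symmetry (4.2) and the relation (4.5), one finds

```latex
\begin{align*}
(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b})
  &= \mathbf{a}\cdot\mathbf{a} - \mathbf{a}\cdot\mathbf{b}
     + \mathbf{b}\cdot\mathbf{a} - \mathbf{b}\cdot\mathbf{b} \\
  &= |\mathbf{a}|^{2} - |\mathbf{b}|^{2}.
\end{align*}
```

Hence the diagonals are perpendicular if and only if |a| = |b|, that is, if and only if the parallelogram is a rhombus.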


Trigonometry can now be regarded as a system of algebraic equations and 
relations without any mention of triangles and projections. However, it is 
precisely the relation of vectors to triangles and of the inner product to projec- 
tions that makes the algebra of scalars and vectors a useful language for 
describing the real world. And that, after all, is what the whole scheme was 
designed for. 


1-5. The Outer Product 


The algebra of scalars and vectors based on the rules just mentioned has been so 
widely accepted as to be routinely employed by mathematicians and physicists 
today. As it stands, however, this algebra is still incapable of providing a full 
expression of geometrical ideas. Yet there is nothing close to a consensus on how 
to overcome this limitation. Rather there is a great proliferation of different 
mathematical systems designed to express geometrical ideas (tensor algebra,
matrix algebra and spinor algebra, to name just a few of the most common). It
might be thought that this profusion of systems reveals the richness of
mathematics. On the contrary, it reveals a widespread confusion: confusion about the
aims and principles of geometric algebra. The intent here is to clarify these aims
and principles by showing that the preceding arguments leading to the invention 
of scalars and vectors can be continued in a natural way, culminating in a single 
mathematical system which facilitates a simple expression of the full range of 
geometrical ideas. 

The principle that the product of two vectors ought to describe their relative 
directions presided over the definition of the inner product. But the inner 
product falls short of a complete fulfillment of that principle, because it fails to 
express the fundamental geometrical fact that two non-parallel lines determine a 
plane, or, better, that two non-collinear directed line segments determine a 
parallelogram. The possibility of giving this important feature of geometry a 
direct algebraic expression becomes apparent when the parallelogram is re- 
garded as a kind of “geometrical product” of its sides. But to make this 
possibility a reality, the notion of number must again be generalized. 

A parallelogram can be regarded as a directed plane segment. Just as
vectors were invented to characterize the notion of a directed line segment, so
a new kind of directed number, called a bivector or 2-vector, can be
introduced to characterize the notion of directed plane segment (Figure 5.1). Like
a vector, a bivector has magnitude, direction and orientation, and only these
properties. But here the word “direction” must be understood in a sense
more general than is usual. Just as the direction of a vector corresponds to an
(oriented straight) line, so the direction of a bivector corresponds to an
(oriented flat) plane. The distinction between these two kinds of direction
involves the geometrical notion of dimension or grade. Accordingly, the
direction of a bivector is said to be 2-dimensional to distinguish it from the
1-dimensional direction of a vector. And it is sometimes convenient to call a
vector a 1-vector to emphasize its dimension. Also, a scalar can be regarded as
a 0-vector to indicate that it is a 0-dimensional number. Since, as already
shown, the only directional property of a scalar is its orientation, orientation
can be regarded as a 0-dimensional direction. Thus the idea of numbers with
different geometrical dimension begins to take shape.


Fig. 5.1. Bivectors and Plane Segments.


A bivector B can be pictured as a plane segment. Just as vectors with the same
direction can be represented by line segments on parallel lines, so bivectors with
the same direction can be represented by plane segments in parallel planes.

The magnitude of B is a scalar denoted by |B|. The magnitude of B is equal to
the area of the corresponding plane segment. The shape of the plane segment is
irrelevant, or rather, is not associated with any property of B. However, a circular
shape suggests the fact that B does not distinguish any one direction in the plane
from any other, while a parallelogram indicates a relation of the plane segment to
line segments.

The orientation of the bivector (and the corresponding plane) can be indicated
by an arrowhead assigning a “sense” to the curve bounding the plane segment. A
bivector B and its negative, denoted by −B, can be pictured as the same figure but
with opposite orientations. The two orientations of a plane (or bivector) are
commonly distinguished by the words “clockwise” and “counterclockwise”.

Like a vector, a bivector should not be regarded as having a place. Plane
segments with the same magnitude and direction can be regarded as different
representations of one and the same bivector no matter where they are located; so
they can be labeled by one and the same bivector symbol. Plane segments that do
not lie in parallel planes must be labeled with different bivector symbols even if
they have the same magnitudes.

In ordinary geometry the concepts of line and plane play roles of compar- 
able significance. Indeed, the one concept can hardly be said to have any 
significance at all apart from the other, and the mathematical meaning of 
‘line’ and ‘plane’ is determined solely by specifying relations between 
them. To give “planes” and “lines” equal algebraic representation, the notion
of directed number must be enlarged to include the notion of bivector as well 
as vector, and the relations of lines to planes must be reflected in the relations 
of vectors to bivectors. It may be a good idea to point out that both line and 
plane, as commonly conceived, consist of a set of points in definite relation to 
one another. It is the nature of this relation that distinguishes line from plane. 
A single vector completely characterizes the directional relation of points in a 
given line. A single bivector completely characterizes the directional relation 
of points in a given plane. In other words, a bivector does not describe a set of 
points in a plane; rather it describes the directional property of such a set, 
which, so to speak, specifies the plane the points are “in”. Thus, the notion of
a plane as a relation can be separated from the notion of a plane as a point set. 
After the directional properties of planes and lines have been fully incorpo- 
rated into an algebra of directed numbers, the geometrical properties of point 
sets can be more easily and completely described than ever before, as we shall 
see. 

Now return to the problem of giving algebraic expression to the relation of 
line segments to plane segments. Note that a point moving a distance and 
direction specified by a vector a sweeps out a directed line segment. And the 
points on this line segment, each moving a distance and direction specified by 
a vector b, sweep out a parallelogram (Figure 5.2). Since the bivector B
corresponding to this parallelogram is clearly uniquely determined by this 
geometrical construction, it may be regarded as a kind of “product” of the 
vectors a and b. So write 


a∧b = B. (5.1)


A “wedge” is used to denote this new kind of multiplication to distinguish it
from the “dot” denoting the inner product of vectors. The bivector a∧b is
said to be the outer product of vectors a and b.


Fig. 5.2. The “parallelogram rule” for outer multiplication.


Note that the order of arrows on the boundary determines an orientation for the
parallelogram. The arrows indicate the path of a point which first sweeps out a line
segment and then, as the line segment moves, an edge of the parallelogram.

Now note that the parallelogram obtained by “sweeping b along a” differs
only in orientation from the parallelogram obtained by “sweeping a along b”
(Figure 5.2). This can be simply expressed by writing


b∧a = −a∧b = −B. (5.2)


Thus, reversing the order of vectors in an outer product “reverses” the
orientation of the resulting bivector. This is expressed by saying that the outer
product is anticommutative.

The relation of vector orientation to bivector orientation is fixed by the rule 


b∧a = a∧(−b) = (−b)∧(−a) = (−a)∧b. (5.3)


This rule, like the others, follows from the correspondence of vectors and 
bivectors with oriented line segments and plane segments. It can be simply 
“read off” from Figure 5.3. 


Fig. 5.3. Relative orientations of vectors and bivectors. 


The same bivector is obtained from the outer product of any pair of vectors 
labeling consecutive oriented line segments bounding an oriented parallelogram. 
Note that directed line segments on opposite sides of an oriented parallelogram 
correspond to vectors of opposite orientation. 


24 Origins of Geometric Algebra 


Since the magnitude of the bivector a∧b is just the area of the
corresponding parallelogram,


|B| = |a∧b| = |b∧a| = |a||b| sin θ, (5.4)


where θ is the angle between vectors a and b. This formula expresses the
relation between vector magnitudes and bivector magnitudes. The relation to
sin θ is given in (5.4) for comparison with trigonometry; it is not part of the
definition.

Scalar multiplication can be defined for bivectors in the same way as it was 
for vectors. For bivectors C and B and scalar A, the equation 


C = AB (5.5) 
means that the magnitude of B is dilated by the magnitude of A, that is, 
|C| = |A||BI, (5.6) 


and the direction of C is the same as that of B if A is positive, or opposite to it 
if A is negative. This last stipulation can be expressed by equations for 
multiplication by the unit scalars one and minus one: 


(1)B=B, (-1)B=-B. (5.7) 


Bivectors which are scalar multiples of one another are said to be codirectional. 
Scalar multiplications of vectors and bivectors are related by the equation 


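$$\lambda(\mathbf{a} \wedge \mathbf{b}) = (\lambda\mathbf{a}) \wedge \mathbf{b} = \mathbf{a} \wedge (\lambda\mathbf{b}). \tag{5.8}$$


For $\lambda = -1$, this is equivalent to Equation (5.3). For positive $\lambda$, Equation
(5.8) merely expresses the fact that dilation of one side of a parallelogram
dilates its area by the same amount.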

Note that, by (5.4), $|\mathbf{a} \wedge \mathbf{b}| = 0$ for nonzero a and b if and only if $\sin\theta = 0$,
which is a way of saying that a and b are collinear. Adopting the principle,
already applied to vectors, that a directed number is zero if and only if its
magnitude is zero, it follows that $|\mathbf{a} \wedge \mathbf{b}| = 0$ if and only if $\mathbf{a} \wedge \mathbf{b} = 0$. Hence,
the outer product of nonzero vectors is zero if and only if they are collinear,
that is,


$$\mathbf{a} \wedge \mathbf{b} = 0 \tag{5.9}$$

if and only if $\mathbf{b} = \lambda\mathbf{a}$. Note that if $\lambda \neq 0$, (5.9) together with (5.8) implies that

$$\mathbf{a} \wedge \mathbf{a} = 0. \tag{5.10}$$


This is as it should be, for the anticommutation rule (5.2) implies that
$\mathbf{a} \wedge \mathbf{a} = -\mathbf{a} \wedge \mathbf{a}$, and only zero is equal to its own negative. All of this is in
complete accord with the geometric interpretation of outer multiplication, for
if a and b are collinear, then “sweeping a along b” produces no parallelogram
at all.
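
The rules (5.2), (5.4), (5.9) and (5.10) can be checked numerically. The following Python sketch is only an illustration and is not part of the development in the text. It assumes that vectors are given by rectangular components in 3-dimensional space and that the bivector a∧b can be recorded by the three numbers a_i b_j − a_j b_i (i < j), with its magnitude taken as the root of the sum of their squares. The helper names `dot`, `norm` and `wedge` are hypothetical.

```python
import math

def dot(a, b):
    """Inner product from rectangular components."""
    return sum(x * y for x, y in zip(a, b))

def norm(x):
    """Magnitude: square root of the sum of squared components."""
    return math.sqrt(dot(x, x))

def wedge(a, b):
    """Components of the bivector a^b on the index pairs (1,2), (1,3), (2,3)."""
    return tuple(a[i] * b[j] - a[j] * b[i] for i, j in ((0, 1), (0, 2), (1, 2)))

a = (1, 2, 3)
b = (4, -1, 2)

B = wedge(a, b)
B_reversed = wedge(b, a)                       # rule (5.2): b^a = -a^b

theta = math.acos(dot(a, b) / (norm(a) * norm(b)))
area = norm(a) * norm(b) * math.sin(theta)     # right side of (5.4)

collinear = wedge(a, tuple(2 * x for x in a))  # rule (5.9) with b = 2a
square = wedge(a, a)                           # rule (5.10)

print(B, B_reversed)
print(norm(B), area)
print(collinear, square)
```

The components here are chosen as integers so that the sign reversal and the vanishing products can be compared exactly; only the comparison with sin θ involves floating-point arithmetic.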
The relation of addition to outer multiplication is determined by the
distributive rule:


$$\mathbf{a} \wedge (\mathbf{b} + \mathbf{c}) = \mathbf{a} \wedge \mathbf{b} + \mathbf{a} \wedge \mathbf{c}. \tag{5.11}$$


The corresponding geometrical construction is illustrated in Figure 5.4. Note 
that (5.11) relates addition of vectors on the left to addition of bivectors on 
the right. So the algebraic properties and the geometrical interpretation of 
bivector addition are completely determined by the properties and interpre- 
tation already accorded to vector addition. For example, the sum of two 
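bivectors is a unique bivector, and again, bivector addition is associative. For
bivectors with the same direction, it is easily seen that the distributive rule
(5.11) reduces to the usual rule for adding areas.


$$\mathbf{a} \wedge (\mathbf{b} + \mathbf{c}) = \mathbf{a} \wedge \mathbf{b} + \mathbf{a} \wedge \mathbf{c} = \mathbf{a} \wedge (\mathbf{b}_{\parallel} + \mathbf{c}_{\parallel}) = \mathbf{a} \wedge \mathbf{b}_{\parallel} + \mathbf{a} \wedge \mathbf{c}_{\parallel}$$


Fig. 5.4. Distributive rule for outer multiplication.


To prove the distributive rule, express b as the sum of a part $\mathbf{b}_{\parallel}$ collinear with b + c and
a part $\mathbf{b}_{\perp}$ orthogonal to b + c. Do the same for c and observe that $\mathbf{b}_{\perp} = -\mathbf{c}_{\perp}$, so
$\mathbf{b} + \mathbf{c} = (\mathbf{b}_{\parallel} + \mathbf{b}_{\perp}) + (\mathbf{c}_{\parallel} + \mathbf{c}_{\perp}) = \mathbf{b}_{\parallel} + \mathbf{c}_{\parallel}$ and


$$\mathbf{a} \wedge (\mathbf{b} + \mathbf{c}) = \mathbf{a} \wedge (\mathbf{b}_{\parallel} + \mathbf{c}_{\parallel}) = \mathbf{a} \wedge \mathbf{b}_{\parallel} + \mathbf{a} \wedge \mathbf{c}_{\parallel}.$$


This reduces the distributive rule to the usual rule for adding areas, since all the bivectors in
the equation are codirectional. Thus, if $\mathbf{b}_{\parallel}$ and $\mathbf{c}_{\parallel}$ have the same orientation as b + c,
which is the case in the above diagram, then


$$|\mathbf{a} \wedge (\mathbf{b} + \mathbf{c})| = |\mathbf{a} \wedge (\mathbf{b}_{\parallel} + \mathbf{c}_{\parallel})| = |\mathbf{a} \wedge \mathbf{b}_{\parallel}| + |\mathbf{a} \wedge \mathbf{c}_{\parallel}|.$$


It may happen, however, that the orientation of $\mathbf{c}_{\parallel}$ is opposite to that of b + c, in which
case the orientation of $\mathbf{a} \wedge (\mathbf{b} + \mathbf{c})$ is opposite to that of $\mathbf{a} \wedge \mathbf{c}_{\parallel}$, and


$$|\mathbf{a} \wedge (\mathbf{b} + \mathbf{c})| = |\mathbf{a} \wedge (\mathbf{b}_{\parallel} + \mathbf{c}_{\parallel})| = |\mathbf{a} \wedge \mathbf{b}_{\parallel}| - |\mathbf{a} \wedge \mathbf{c}_{\parallel}|.$$


Construction of a diagram corresponding to this case is left to the student.
It should be evident from the diagram that quite generally


$$|\mathbf{a} \wedge (\mathbf{b} + \mathbf{c})| \le |\mathbf{a} \wedge \mathbf{b}| + |\mathbf{a} \wedge \mathbf{c}|,$$


with equality possible only if a, b and c are coplanar. The quantities $|\mathbf{a} \wedge \mathbf{b}|$ and $|\mathbf{a} \wedge \mathbf{c}|$ are
ordinary areas and can be added like any other scalars. But $\mathbf{a} \wedge \mathbf{b}$ and $\mathbf{a} \wedge \mathbf{c}$ are “directed
areas” and add “like vectors”.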

Both the inner and outer products are measures of relative direction, but 
they complement one another. Relations which are difficult or impossible to 
obtain with one may be easy to obtain with the other. Whereas the equation 
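$\mathbf{a} \cdot \mathbf{b} = 0$ provides a simple expression of “perpendicular”, $\mathbf{a} \wedge \mathbf{b} = 0$ provides a
simple expression of “parallel”. To illustrate the point, reconsider the vector
equation for a triangle, which was analyzed above with the help of the inner
product. Take the outer product of a + b = c successively with vectors a, b,
c, and use the rules (5.10) and (5.11) to obtain the three equations


$$\mathbf{a} \wedge \mathbf{b} = \mathbf{a} \wedge \mathbf{c},$$
$$\mathbf{b} \wedge \mathbf{a} = \mathbf{b} \wedge \mathbf{c},$$
$$\mathbf{c} \wedge \mathbf{a} + \mathbf{c} \wedge \mathbf{b} = 0.$$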


Only two of these equations are independent; the third, for instance, is the 
sum of the first two. It is convenient to write the first two equations on a single 
line, like so: 


$$\mathbf{a} \wedge \mathbf{c} = \mathbf{a} \wedge \mathbf{b} = \mathbf{c} \wedge \mathbf{b}. \tag{5.12}$$


Here are three different ways of expressing the same bivector as a product of 
vectors. This gives three different ways of expressing its magnitude: 


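$$|\mathbf{a} \wedge \mathbf{c}| = |\mathbf{c} \wedge \mathbf{b}| = |\mathbf{a} \wedge \mathbf{b}|. \tag{5.13}$$

Using (5.4) and the scalar labels for a triangle indicated in Figure 4.4, one
gets, after dividing by $abc$,


$$\frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c}. \tag{5.14}$$


This formula is called the “law of sines” in trigonometry. We shall see in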
Chapter 2 that all the formulas of plane and spherical trigonometry can be 
easily derived and compactly expressed by using inner and outer products. 
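
As a numerical illustration of (5.12) to (5.14), the following Python sketch builds one triangle a + b = c in a plane and compares the three outer products and the three ratios. It is a sketch under stated assumptions, not part of the text: the vectors are given by rectangular components, the outer product of plane vectors is recorded by its single component, and the angle labels are assumed to follow the usual convention that A is opposite side a, B opposite b and C opposite c. The helper names are hypothetical.

```python
import math

def wedge2(u, v):
    """Single component of the outer product of two vectors in a plane."""
    return u[0] * v[1] - u[1] * v[0]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def norm(u):
    return math.hypot(u[0], u[1])

def sin_between(u, v):
    """Sine of the angle between u and v, computed from the inner product only."""
    cos = dot(u, v) / (norm(u) * norm(v))
    return math.sqrt(1.0 - cos * cos)

a = (3.0, 0.0)
b = (1.0, 2.0)
c = (a[0] + b[0], a[1] + b[1])            # triangle equation a + b = c

# Equation (5.12): three factorizations of the same bivector.
products = (wedge2(a, c), wedge2(a, b), wedge2(c, b))

# Assumed labeling: angle A is opposite side a, B opposite b, C opposite c.
ratios = (
    sin_between(c, b) / norm(a),          # sin A / a
    sin_between(a, c) / norm(b),          # sin B / b
    sin_between(a, b) / norm(c),          # sin C / c
)

print(products)
print(ratios)
```

The sines are deliberately computed from inner products rather than from the outer products, so that the agreement of the three ratios is an independent check of (5.14).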

The theory of the outer product as described so far calls for an obvious
generalization. Just as a plane segment is swept out by a moving line segment,
a “space segment” is swept out by a moving plane segment. Thus, the points
on an oriented parallelogram specified by the bivector $\mathbf{a} \wedge \mathbf{b}$ moving a distance
and direction specified by a vector c sweep out an oriented parallelepiped
(Figure 5.5), which may be characterized by a new kind of directed
number T called a trivector or 3-vector. The properties of T are fixed by
regarding it as equal to the outer product of the bivector $\mathbf{a} \wedge \mathbf{b}$ with the vector
c. So write


$$(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = \mathbf{T}. \tag{5.15}$$


Fig. 5.5. The “parallelepiped rule” for outer multiplication: $(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = \mathbf{T}$, $(\mathbf{a} \wedge \mathbf{b}) \wedge (-\mathbf{c}) = -\mathbf{T}$.


Displacement of an oriented parallelogram sweeps out an oriented parallelepiped.
Displacement in the opposite direction sweeps out a parallelepiped with
opposite orientation.


The study of trivectors leads to results quite analogous to those obtained
above for bivectors, so the analysis need not be carried out in detail. But one
new result obtains, namely, the conclusion that outer multiplication should
obey the associative rule:


$$(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = \mathbf{a} \wedge (\mathbf{b} \wedge \mathbf{c}). \tag{5.16}$$


The geometric meaning of associativity can be ascertained with the help of the 
following rule: 


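$$(\mathbf{b} \wedge \mathbf{a}) \wedge \mathbf{c} = (-\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = -\mathbf{T}. \tag{5.17}$$


This is an instance of the general rule that the orientation of a product is
reversed by reversing the orientation of one of its factors. Repeated
applications of (5.16) and (5.17) make it possible to rearrange the vectors in a
product to get


$$(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = (\mathbf{b} \wedge \mathbf{c}) \wedge \mathbf{a} = (\mathbf{c} \wedge \mathbf{a}) \wedge \mathbf{b}.$$


This says the same oriented parallelepiped is obtained by sweeping “$\mathbf{a} \wedge \mathbf{b}$
along c”, “$\mathbf{b} \wedge \mathbf{c}$ along a” or “$\mathbf{c} \wedge \mathbf{a}$ along b”. So the associative rule is needed to
express the equivalence of different ways of “building up” a space segment
out of line segments.

Of course, if c “lies in the plane of $\mathbf{a} \wedge \mathbf{b}$”, then “sweeping $\mathbf{a} \wedge \mathbf{b}$ along c” does
not produce a 3-dimensional object. Accordingly, write

$$(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = \mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c} = 0. \tag{5.18}$$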


This equation provides a simple algebraic way of saying that 3 lines (with 
directions denoted by vectors a, b, c) lie in the same plane, just as Equation 
(5.9) provides a simple way of saying that 2 lines are parallel. 

Like any other directed number, a 3-vector has magnitude, direction and
orientation, and only these properties. The dimensionality of a 3-vector is
expressed by the fact that it can be factored into an outer product of three
vectors, though this can be done in an unlimited number of ways. The
magnitude of $\mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c}$ is denoted by $|\mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c}|$ and is equal to the volume of the
parallelepiped determined by the vectors a, b and c.

The orientation of a trivector depends on the order of its factors. The
anticommutation rule together with the associative rule imply that exchange
of any pair of factors in a product reverses the orientation of the result. For
instance, $\mathbf{c} \wedge \mathbf{b} \wedge \mathbf{a} = -\mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c}$. Thus the idea of relative orientation is very
easily expressed with the help of the outer product. Without such algebraic 
apparatus the geometrical idea of orientation is quite difficult to express, and, 
not surprisingly, was only dimly understood before the invention of vectors 
and the outer product. 

The essential aspects of outer multiplication and the generalized notions of 
number and direction it entails have now been set down. No fundamentally 
new insights into the relations between algebra and geometry are achieved by 
considering the outer product of four or more vectors. But it should be 
mentioned that if vectors are used to describe the 3-dimensional space of 
ordinary geometry, then displacement of the trivector $\mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c}$ in a direction
specified by d fails to sweep out a 4-dimensional space segment. So write


$$(\mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c}) \wedge \mathbf{d} = \mathbf{a} \wedge \mathbf{b} \wedge \mathbf{c} \wedge \mathbf{d} = 0. \tag{5.19}$$


The parentheses are unnecessary because of the associative rule (5.16).
Equation (5.19) must hold for any four vectors a, b, c, d. This is a simple way of
saying that space is 3-dimensional. Note the similarity in form and meaning of 
Equations (5.19), (5.18) and (5.9). It should be clear that (5.19) does not 
follow from any ideas or rules previously considered. By supposing that the 
outer product of four vectors is not zero, one is led to an algebraic description 
of spaces and geometries of four or more dimensions, but we already have 
what we need to describe the geometrical properties of physical space. 
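
The rules (5.16) to (5.19) can be tried out on numbers. The Python sketch below is an illustration under assumptions that go beyond the text at this point: it supposes three mutually orthogonal reference directions, stores each directed number as a dictionary from a bitmask of reference directions to a coefficient, and forms outer products of the reference directions by the anticommutation rule. The names `reorder_sign`, `wedge` and `vector` are hypothetical.

```python
def reorder_sign(x, y):
    """Sign from sorting the basis vectors of blade x followed by those of blade y."""
    x >>= 1
    swaps = 0
    while x:
        swaps += bin(x & y).count("1")
        x >>= 1
    return -1 if swaps % 2 else 1

def wedge(p, q):
    """Outer product of two elements stored as {basis bitmask: coefficient}."""
    out = {}
    for bp, cp in p.items():
        for bq, cq in q.items():
            if bp & bq:                  # a repeated basis vector gives zero, as in (5.10)
                continue
            key = bp | bq
            out[key] = out.get(key, 0) + reorder_sign(bp, bq) * cp * cq
    return {k: v for k, v in out.items() if v != 0}

def vector(*coords):
    """Vector from rectangular components; bit i marks reference direction i+1."""
    return {1 << i: x for i, x in enumerate(coords) if x != 0}

a, b, c, d = vector(1, 2, 3), vector(0, 1, 4), vector(5, 6, 0), vector(7, -1, 2)

T = wedge(wedge(a, b), c)                        # (5.15)
T_assoc = wedge(a, wedge(b, c))                  # associative rule (5.16)
T_swapped = wedge(wedge(c, b), a)                # exchanging a and c reverses orientation
coplanar = wedge(wedge(a, b), vector(1, 3, 7))   # third vector is a + b, as in (5.18)
four = wedge(T, d)                               # (5.19): no 4-vector in 3 dimensions

print(T, T_assoc, T_swapped)
print(coplanar, four)
```

In this representation a vanishing product shows up as an empty dictionary, and the single coefficient of the trivector is the signed volume of the parallelepiped.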

The outer product was invented by Herman Grassmann, and, following a 
line of thought similar to the one above, developed into a complete math- 
ematical theory before the middle of the nineteenth century. His theory has 
been accorded a prominent place in mathematics only in the last forty years, 
and it is hardly known at all to physicists. Grassmann himself was the only one 
to use it during the first two decades after it was published. There are several 
reasons for this. The most important one arises from the fact that Grass- 
mann’s understanding of the abstract nature of mathematics was far ahead of 
his time. He was the first person to arrive at the modern conception of algebra 
as a system of rules relating otherwise undefined entities. He realized that the 
nature of the outer product could be defined by specifying the rules it obeys, 
especially the distributive, associative, and anticommutative rules given above.
He rightly expounded this momentous insight in great detail. And he proved 
its significance by showing, for the first time, how abstract algebra can take us 
beyond the 3-dimensional space of experience to a conception of space with
any number of dimensions. Unfortunately, in his enthusiasm for abstract 
developments, Grassmann deemphasized the geometric origin and interpre- 
tation of his rules. No doubt many potential readers would have appreciated 
the geometrical applications of Grassmann’s system, but most were simply 
confounded by the profusion of unfamiliar abstract ideas in Grassmann’s long 
books. 

The seeds of Herman Grassmann’s great invention were sown by his father, 
Günther, who in 1824, when Herman was 15, published these words in a book
intended for elementary instruction: 


“the rectangle itself is the true geometrical product, and the construction of it is really geometrical 
multiplication. . .. A rectangle is the geometric product of its base and height, and this product 
behaves in the same way as the arithmetic product.” (italics added) 


The elder Grassmann elaborated this idea at some length and must have 
advocated it with considerable enthusiasm to his young son. As it stands, 
however, Günther’s idea is hardly more than a novel way of expressing the
central idea of Book II of Euclid’s Elements. The Greeks made frequent use 
of the correspondence between the product of numbers and the construction 
of a parallelogram from its base and height. For example, Euclid represented 
the distributive rule of algebra as addition of areas and proved it as a 
geometrical theorem. This correspondence between arithmetic and geometry 
was rejected by Descartes and duly ignored by the mathematicians that 
followed him. However, as already explained, Descartes merely associated 
arithmetic multiplication with a different geometric construction. The old 
Greek idea lay dormant until it was reexpressed in strong arithmetic terms by 
Günther Grassmann. But the truly significant advance, from the idea of a
geometrical product to its full algebraic expression by outer multiplication, 
was made by his son. 

Herman Grassmann completed the algebraic formulation of basic ideas in 
Greek geometry begun by Descartes. The Greek theory of ratio and pro- 
portion is now incorporated in the properties of scalars and scalar multiplica- 
tion. The Greek idea of projection is incorporated in the inner product. And 
the Greek geometrical product is expressed by outer multiplication. The 
invention of a system of directed numbers to express Greek geometrical 
notions makes it possible, as Descartes had already said, to go far beyond the 
geometry of the Greeks. It also leads to a deeper appreciation of the Greek 
accomplishments. Only in the light of Grassmann’s outer product is it possible 
to understand that the careful Greek distinction between number and magni- 
tude has real geometrical significance. It corresponds roughly to the distinc- 
tion between scalar and vector. Actually the Greek magnitudes added like 
scalars but multiplied like vectors, so multiplication of Greek magnitudes 
involves the notions of direction and dimension, and Euclid was quite right in 
distinguishing it from multiplication of ‘Greek numbers” (our scalars). Only 
in the work of Grassmann are the notions of direction, dimension, orientation
and scalar magnitude finally disentangled. But his great accomplishment 
would have been impossible without the earlier vague distinction of the 
Greeks, and perhaps without its reformulation in quasi-arithmetic terms by 
his father. 


1-6. Synthesis and Simplification 


Grassmann was the first person to define multiplication simply by specifying a 
set of algebraic rules. By systematically surveying various possible rules, he 
discovered several other kinds of multiplication besides his inner and outer 
products. Nevertheless, he overlooked the most important possibility until 
late in his life, when he was unable to follow up on its implications. There is 
one fundamental kind of geometrical product from which all other significant 
geometrical products can be obtained. All the geometrical facts needed to 
discover such a product have been mentioned above. 

It has already been noted that the inner and outer products seem to 
complement one another by describing independent geometrical relations. 
This circumstance deserves the most careful study. The simplest approach is 
to entertain the possibility of introducing a new kind of product ab by the 
equation 


$$\mathbf{a}\mathbf{b} = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b}. \tag{6.1}$$


Here the scalar $\mathbf{a} \cdot \mathbf{b}$ has been added to the bivector $\mathbf{a} \wedge \mathbf{b}$. At first sight it may
seem absurd to add two directed numbers with different grades. That may
have delayed Grassmann from considering it. For centuries the notion that
you can only add “like things” has been relentlessly impressed on the mind of
every schoolboy. It is a kind of mathematical taboo — its real justification 
unknown or forgotten. It is supposedly obvious that you cannot add apples 
and oranges or feet and square feet. On the contrary, it is only obvious that 
addition of apples and oranges is not usually a practical thing to do — unless 
you are making a salad. 

Absurdity disappears when it is realized that (6.1) can be justified in the 
abstract “Grassmannian” fashion which has become standard mathematical
procedure today. All that mathematics really requires is that the indicated 
relations and operations be well defined and consistently employed. The 
mathematical meaning of adding scalars and bivectors is determined by 
specifying that such addition satisfy the usual commutative and associative 
rules. Use of the “equal sign” in (6.1) is justified by assuming that it obeys the
same rules as those governing equality in ordinary scalar algebra. With this 
understood, it now can be shown that the properties of the new product are 
almost completely determined by the obvious requirement that they be consis- 
tent with the properties already accorded to the inner and outer products. 
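The commutative rule $\mathbf{b} \cdot \mathbf{a} = \mathbf{a} \cdot \mathbf{b}$ together with the anticommutative rule
$\mathbf{b} \wedge \mathbf{a} = -\mathbf{a} \wedge \mathbf{b}$ imply a relation between ab and ba. Thus,


$$\mathbf{b}\mathbf{a} = \mathbf{b} \cdot \mathbf{a} + \mathbf{b} \wedge \mathbf{a} = \mathbf{a} \cdot \mathbf{b} - \mathbf{a} \wedge \mathbf{b}. \tag{6.2}$$


Comparison of (6.1) with (6.2) shows that, in general, ab is not equal to ba
because, though their scalar parts are equal, their bivector parts are not.
However, if $\mathbf{a} \wedge \mathbf{b} = 0$, then


$$\mathbf{a}\mathbf{b} = \mathbf{a} \cdot \mathbf{b} = \mathbf{b}\mathbf{a}. \tag{6.3}$$

And if $\mathbf{a} \cdot \mathbf{b} = 0$, then

$$\mathbf{a}\mathbf{b} = \mathbf{a} \wedge \mathbf{b} = -\mathbf{b} \wedge \mathbf{a} = -\mathbf{b}\mathbf{a}. \tag{6.4}$$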


It should not escape notice that to get (6.3) and (6.4) from (6.1) the usual 
“additive property of zero” is needed, and no distinction between a scalar 
zero and a bivector zero is called for. 

The product ab inherits a geometrical interpretation from the interpret- 
ations already accorded to the inner and outer products. It is an algebraic 
measure of the relative direction of vectors a and b. Thus, from (6.3) and 
(6.4) it should be clear that vectors are collinear if and only if their product is 
commutative, and they are orthogonal if and only if their product is anticom- 
mutative. But more properties of the product are required to understand its 
significance when the relative direction of two vectors is somewhere between 
the extremes of collinearity and orthogonality. 

To give due recognition to its geometric significance ab will henceforth be 
called the geometric product of vectors a and b. 

From the distributive rules (4.4) and (5.11) for inner and outer products, it 
follows that the geometric product must obey the left and right distributive 
rules 


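$$\mathbf{a}(\mathbf{b} + \mathbf{c}) = \mathbf{a}\mathbf{b} + \mathbf{a}\mathbf{c}, \tag{6.5}$$
$$(\mathbf{b} + \mathbf{c})\mathbf{a} = \mathbf{b}\mathbf{a} + \mathbf{c}\mathbf{a}. \tag{6.6}$$

Equation (6.5) can be derived from (6.1) by the following steps:

$$\begin{aligned}
\mathbf{a}(\mathbf{b} + \mathbf{c}) &= \mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) + \mathbf{a} \wedge (\mathbf{b} + \mathbf{c}) \\
&= (\mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}) + (\mathbf{a} \wedge \mathbf{b} + \mathbf{a} \wedge \mathbf{c}) \\
&= (\mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b}) + (\mathbf{a} \cdot \mathbf{c} + \mathbf{a} \wedge \mathbf{c}) \\
&= \mathbf{a}\mathbf{b} + \mathbf{a}\mathbf{c}.
\end{aligned}$$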


Note that the usual properties of equality and the commutative and associat- 
ive rules of addition have been employed. Equation (6.6) can be derived from 
(6.2) in the same way. The distributive rules (6.5) and (6.6) are independent 
of one another, because multiplication is not commutative. To derive them, 
the distributive rules for both the inner and outer products were needed. 

The relation of scalar multiplication to the geometric product is described 
by the equations 
$$\lambda(\mathbf{a}\mathbf{b}) = (\lambda\mathbf{a})\mathbf{b} = \mathbf{a}(\lambda\mathbf{b}). \tag{6.7}$$


This is easily derived from (4.3) and (5.8) with the help of the definition (6.1).
It says that scalar and vector multiplication are mutually commutative and
associative. If the commutative rule is separated from the associative rule, it
takes the simple form


$$\lambda\mathbf{a} = \mathbf{a}\lambda. \tag{6.8}$$


Now observe that by taking the sum and difference of equations (6.1) and 
(6.2), one gets 


$$\mathbf{a} \cdot \mathbf{b} = \tfrac{1}{2}(\mathbf{a}\mathbf{b} + \mathbf{b}\mathbf{a}), \tag{6.9}$$

and

$$\mathbf{a} \wedge \mathbf{b} = \tfrac{1}{2}(\mathbf{a}\mathbf{b} - \mathbf{b}\mathbf{a}). \tag{6.10}$$


This points the way to a great simplification. Instead of regarding (6.1) as a
definition of ab, consider ab as fundamental and regard (6.9) and (6.10) as
definitions of $\mathbf{a} \cdot \mathbf{b}$ and $\mathbf{a} \wedge \mathbf{b}$. This reduces two kinds of vector multiplication to
one. It is curious, then, to note that by (6.9) the commutativity of the inner
product arises from the commutativity of addition, and by (6.10) the anticommutativity
of the outer product arises from the anticommutativity of subtraction.

The algebraic properties of the geometric product of two vectors have 
already been ascertained. It should be evident that the corresponding proper- 
ties of the inner and outer products can be derived from the definitions (6.9) 
and (6.10) simply by reversing the arguments already given. 
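
The passage from the geometric product back to the inner and outer products can be followed on a numerical example. The Python sketch below is an illustration only. It rests on assumptions not yet introduced in the text: three mutually orthogonal unit vectors are taken as reference directions, each squaring to one and anticommuting with the others, and every directed number is stored as a dictionary from a bitmask of reference directions to a coefficient (key 0 for scalars). The names `reorder_sign`, `gp`, `combine`, `scale` and `vector` are hypothetical.

```python
from fractions import Fraction

def reorder_sign(x, y):
    """Sign from sorting the basis vectors of blade x followed by those of blade y."""
    x >>= 1
    swaps = 0
    while x:
        swaps += bin(x & y).count("1")
        x >>= 1
    return -1 if swaps % 2 else 1

def gp(p, q):
    """Geometric product; a repeated basis vector squares to +1 and drops out."""
    out = {}
    for bp, cp in p.items():
        for bq, cq in q.items():
            key = bp ^ bq
            out[key] = out.get(key, 0) + reorder_sign(bp, bq) * cp * cq
    return {k: v for k, v in out.items() if v != 0}

def combine(p, q, sign=1):
    """Return p + sign*q with zero coefficients removed."""
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) + sign * v
    return {k: v for k, v in out.items() if v != 0}

def scale(s, p):
    """Scalar multiple of a stored element."""
    return {k: s * v for k, v in p.items() if s * v != 0}

def vector(*coords):
    """Vector from rectangular components; bit i marks reference direction i+1."""
    return {1 << i: x for i, x in enumerate(coords) if x != 0}

HALF = Fraction(1, 2)
a, b = vector(1, 2, 3), vector(4, -1, 2)
ab, ba = gp(a, b), gp(b, a)

inner = scale(HALF, combine(ab, ba))         # definition (6.9): scalar part only
outer = scale(HALF, combine(ab, ba, -1))     # definition (6.10): bivector part only

collinear_commute = gp(a, scale(2, a)) == gp(scale(2, a), a)          # as in (6.3)
u, v = vector(1, 0, 0), vector(0, 1, 0)
orthogonal_anticommute = gp(u, v) == scale(-1, gp(v, u))              # as in (6.4)

print(inner, outer)
print(collinear_commute, orthogonal_anticommute)
```

Exact rational arithmetic is used so that the sum of the two parts can be compared with ab, and their difference with ba, without rounding.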

The next task is to examine the geometric product of three vectors a, b, c. It
is certainly desirable that this product satisfy the associative rule


$$\mathbf{a}(\mathbf{b}\mathbf{c}) = (\mathbf{a}\mathbf{b})\mathbf{c} = \mathbf{a}\mathbf{b}\mathbf{c}, \tag{6.11}$$


for that greatly simplifies algebraic manipulations. But it must be shown that 
this rule leads to results in accordance with the inner and outer products. This 
can be done by examining the product of a vector with a bivector. 


The product aB of a vector a with a bivector B can be expressed as a sum of
“symmetric” and “antisymmetric” parts in the following way:


$$\begin{aligned}
\mathbf{a}\mathbf{B} &= \tfrac{1}{2}(\mathbf{a}\mathbf{B} + \mathbf{a}\mathbf{B}) + \tfrac{1}{2}(\mathbf{B}\mathbf{a} - \mathbf{B}\mathbf{a}) \\
&= \tfrac{1}{2}(\mathbf{a}\mathbf{B} - \mathbf{B}\mathbf{a}) + \tfrac{1}{2}(\mathbf{a}\mathbf{B} + \mathbf{B}\mathbf{a}).
\end{aligned}$$

Anticipating results to be obtained, introduce the notations

$$\mathbf{a} \cdot \mathbf{B} = \tfrac{1}{2}(\mathbf{a}\mathbf{B} - \mathbf{B}\mathbf{a}) = -\mathbf{B} \cdot \mathbf{a}, \tag{6.12}$$
$$\mathbf{a} \wedge \mathbf{B} = \tfrac{1}{2}(\mathbf{a}\mathbf{B} + \mathbf{B}\mathbf{a}) = \mathbf{B} \wedge \mathbf{a}. \tag{6.13}$$

So

$$\mathbf{a}\mathbf{B} = \mathbf{a} \cdot \mathbf{B} + \mathbf{a} \wedge \mathbf{B}. \tag{6.14}$$

As the notation indicates, $\mathbf{a} \wedge \mathbf{B}$ is to be regarded as identical to the outer
product of vector and bivector which has already been introduced for geometric
reasons. The quantity $\mathbf{a} \cdot \mathbf{B}$ is something new; as the notation suggests,
it is to be regarded as a generalization of the inner product of vectors.

Note that (6.13) differs from (6.10) by a sign, because (6.13) has a bivector
where (6.10) has a vector. The sign in (6.13) is justified by showing that (6.13)
yields the properties already ascribed to the outer product. To this end,
it is sufficient to show that (6.13) implies the associative rule
$(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = \mathbf{a} \wedge (\mathbf{b} \wedge \mathbf{c})$, since all the other basic properties of the outer product have
already been ascertained. Of course the properties of the geometric product,
including the associative rule (6.11), must be freely used in the proof. Utilizing
the definitions (6.10) and (6.13) for the outer product,


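$$\begin{aligned}
(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} &= \tfrac{1}{2}\left[\tfrac{1}{2}(\mathbf{a}\mathbf{b} - \mathbf{b}\mathbf{a})\mathbf{c} + \mathbf{c}\,\tfrac{1}{2}(\mathbf{a}\mathbf{b} - \mathbf{b}\mathbf{a})\right] \\
&= \tfrac{1}{4}[\mathbf{a}\mathbf{b}\mathbf{c} - \mathbf{b}\mathbf{a}\mathbf{c} + \mathbf{c}\mathbf{a}\mathbf{b} - \mathbf{c}\mathbf{b}\mathbf{a}].
\end{aligned}$$


Similarly,


$$\begin{aligned}
\mathbf{a} \wedge (\mathbf{b} \wedge \mathbf{c}) &= \tfrac{1}{2}\left[\mathbf{a}\,\tfrac{1}{2}(\mathbf{b}\mathbf{c} - \mathbf{c}\mathbf{b}) + \tfrac{1}{2}(\mathbf{b}\mathbf{c} - \mathbf{c}\mathbf{b})\mathbf{a}\right] \\
&= \tfrac{1}{4}[\mathbf{a}\mathbf{b}\mathbf{c} - \mathbf{a}\mathbf{c}\mathbf{b} + \mathbf{b}\mathbf{c}\mathbf{a} - \mathbf{c}\mathbf{b}\mathbf{a}].
\end{aligned}$$


On taking the difference of these expressions several terms cancel and the
remaining terms can be arranged to give

$$\begin{aligned}
(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} - \mathbf{a} \wedge (\mathbf{b} \wedge \mathbf{c}) &= -\tfrac{1}{4}\mathbf{b}(\mathbf{a}\mathbf{c} + \mathbf{c}\mathbf{a}) + \tfrac{1}{4}(\mathbf{c}\mathbf{a} + \mathbf{a}\mathbf{c})\mathbf{b} \\
&= -\tfrac{1}{2}\mathbf{b}(\mathbf{a} \cdot \mathbf{c}) + \tfrac{1}{2}(\mathbf{a} \cdot \mathbf{c})\mathbf{b} = 0.
\end{aligned}$$

Note that the fact that the vector inner product is a scalar is needed in the last
step of the proof.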


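Now to understand the significance of $\mathbf{a} \cdot \mathbf{B}$, let $\mathbf{B} = \mathbf{b} \wedge \mathbf{c}$. Use the definitions
as before to eliminate the dot and wedge:


$$\begin{aligned}
\mathbf{a} \cdot (\mathbf{b} \wedge \mathbf{c}) &= \tfrac{1}{2}\left[\mathbf{a}\,\tfrac{1}{2}(\mathbf{b}\mathbf{c} - \mathbf{c}\mathbf{b}) - \tfrac{1}{2}(\mathbf{b}\mathbf{c} - \mathbf{c}\mathbf{b})\mathbf{a}\right] \\
&= \tfrac{1}{4}[\mathbf{a}\mathbf{b}\mathbf{c} - \mathbf{a}\mathbf{c}\mathbf{b} - \mathbf{b}\mathbf{c}\mathbf{a} + \mathbf{c}\mathbf{b}\mathbf{a}].
\end{aligned}$$

To this, add

$$0 = \tfrac{1}{4}[\mathbf{b}\mathbf{a}\mathbf{c} - \mathbf{c}\mathbf{a}\mathbf{b} - \mathbf{b}\mathbf{a}\mathbf{c} + \mathbf{c}\mathbf{a}\mathbf{b}],$$


and collect terms to get


$$\begin{aligned}
\mathbf{a} \cdot (\mathbf{b} \wedge \mathbf{c}) &= \tfrac{1}{4}[(\mathbf{a}\mathbf{b} + \mathbf{b}\mathbf{a})\mathbf{c} - (\mathbf{a}\mathbf{c} + \mathbf{c}\mathbf{a})\mathbf{b} - \mathbf{b}(\mathbf{c}\mathbf{a} + \mathbf{a}\mathbf{c}) + \mathbf{c}(\mathbf{b}\mathbf{a} + \mathbf{a}\mathbf{b})] \\
&= \tfrac{1}{4}[2(\mathbf{a} \cdot \mathbf{b})\mathbf{c} - 2(\mathbf{a} \cdot \mathbf{c})\mathbf{b} - \mathbf{b}\,2(\mathbf{a} \cdot \mathbf{c}) + \mathbf{c}\,2(\mathbf{b} \cdot \mathbf{a})] \\
&= (\mathbf{a} \cdot \mathbf{b})\mathbf{c} - (\mathbf{a} \cdot \mathbf{c})\mathbf{b}.
\end{aligned}$$

Thus

$$\mathbf{a} \cdot (\mathbf{b} \wedge \mathbf{c}) = (\mathbf{a} \cdot \mathbf{b})\mathbf{c} - (\mathbf{a} \cdot \mathbf{c})\mathbf{b}. \tag{6.15}$$


This shows that inner multiplication of a vector with a bivector results in a
vector. So Equation (6.14) expresses the fact that the geometric product aB of
a vector with a bivector results in the sum of a vector $\mathbf{a} \cdot \mathbf{B}$ and a trivector $\mathbf{a} \wedge \mathbf{B}$.
The similarity between (6.1) and (6.14) should be noted. They illustrate the
general rule that outer multiplication by a vector “raises the dimension” of
any directed number by one, whereas inner multiplication “lowers” it by one.
Clearly, the generalized inner and outer products provide an algebraic vehicle
for expressing geometric notions about “increasing or decreasing the dimension
of space”.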
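
Equations (6.12) to (6.15) can be verified on a numerical example. The following Python sketch is an illustration under assumptions that are not part of the text at this point: three mutually orthogonal unit reference vectors are supposed, each squaring to one and anticommuting with the others, and directed numbers are stored as dictionaries from a bitmask of reference directions to a coefficient. The function names are hypothetical.

```python
from fractions import Fraction

def reorder_sign(x, y):
    """Sign from sorting the basis vectors of blade x followed by those of blade y."""
    x >>= 1
    swaps = 0
    while x:
        swaps += bin(x & y).count("1")
        x >>= 1
    return -1 if swaps % 2 else 1

def gp(p, q):
    """Geometric product; a repeated basis vector squares to +1 and drops out."""
    out = {}
    for bp, cp in p.items():
        for bq, cq in q.items():
            key = bp ^ bq
            out[key] = out.get(key, 0) + reorder_sign(bp, bq) * cp * cq
    return {k: v for k, v in out.items() if v != 0}

def combine(p, q, sign=1):
    """Return p + sign*q with zero coefficients removed."""
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) + sign * v
    return {k: v for k, v in out.items() if v != 0}

def scale(s, p):
    return {k: s * v for k, v in p.items() if s * v != 0}

def vector(*coords):
    return {1 << i: x for i, x in enumerate(coords) if x != 0}

HALF = Fraction(1, 2)
a, b, c = vector(1, 2, 3), vector(4, -1, 2), vector(0, 5, -2)

B = scale(HALF, combine(gp(b, c), gp(c, b), -1))          # B = b^c by (6.10)
a_dot_B = scale(HALF, combine(gp(a, B), gp(B, a), -1))    # (6.12): a vector
a_wedge_B = scale(HALF, combine(gp(a, B), gp(B, a)))      # (6.13): a trivector

a_dot_b = gp(a, b).get(0, 0)      # the scalar part of ab is a.b
a_dot_c = gp(a, c).get(0, 0)
rhs = combine(scale(a_dot_b, c), scale(a_dot_c, b), -1)   # right side of (6.15)

print(a_dot_B, rhs)
print(a_wedge_B)
```

Keys 1, 2 and 4 label the three vector components and key 7 the single trivector component, so the grades of the two parts of aB can be read off directly from the dictionaries.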


1-7. Axioms for Geometric Algebra 


Let us examine now what we have learned about building a geometric 
algebra. To begin with, the algebra should include the graded elements 
0-vector, 1-vector, 2-vector and 3-vector to represent the directional proper-
ties of points, lines, planes and space. We introduced three kinds of multipli- 
cation, the scalar, inner and outer products, to express relations among the 
elements. But we saw that inner and outer products can be reduced to a single 
geometric product if we allow elements of different grade (or dimension) to 
be added. For this reason, we conclude that the algebra should include 
elements of “mixed grade”’, such as 


$$A = A_0 + \mathbf{A}_1 + \mathbf{A}_2 + \mathbf{A}_3. \tag{7.1}$$


Before continuing, a note about nomenclature is in order. The term 
“dimension” has two distinct but closely related mathematical meanings. To 
separate them we will henceforth use the term ‘‘grade”’ exclusively to mean 
“dimension” in the sense that we have used the term up to this point. We 
have preferred the term ‘“‘dimension” in our introductory discussion, because 
it is likely to have familiar and helpful connotations for the reader. However, 
now we aim to improve the precision of our language with an axiomatic 
formulation of the basic concepts. The alternative meaning for “dimension”
will be explained in Section 2.2. In (7.1), for k = 0, 1, 2, 3, $A_k$ is an element of
grade k called the k-vector part of A. Thus, (7.1) presents A as the sum of a scalar
$A_0$, a vector $\mathbf{A}_1$, a bivector $\mathbf{A}_2$, and a trivector $\mathbf{A}_3$. We refer to A as a
multivector, a (directed) number or a quantity. Any element of the geometric
algebra can be called a multivector, because it can be represented in the form
(7.1). For example, a vector a can be expressed trivially in the form (7.1) by
writing $A = \mathbf{A}_1 = \mathbf{a}$, $A_0 = \mathbf{A}_2 = \mathbf{A}_3 = 0$ and using the property $\mathbf{a} + 0 = \mathbf{a}$.
Note that the k-vectors which are not scalars are denoted by symbols in boldface
type. Such k-vectors are sometimes called k-blades or, simply, blades to
emphasize the fact that, in contrast to 0-vectors (scalars), they have
“directional properties”.

Now another simplification becomes possible. It will be noted that the
geometric product of vectors which we have just considered has, except for
commutativity, the same algebraic properties as scalar multiplication of vectors
and bivectors. In particular, both products are associative and distributive
with respect to addition. Rather than regard them as two different kinds of 
multiplication, we can regard them as instances of a single geometric product 
among different kinds of multivector. Thus, scalar multiplication is the geo- 
metric product of any multivector by a special kind of multivector called a 
scalar. The special geometric nature of the scalars is expressed algebraically 
by the fact that they commute with every other multivector. 

Thus, we define addition and multiplication of multivectors by the following
familiar rules: For multivectors A, B, C, . . ., addition is commutative,

$$A + B = B + A; \tag{7.2}$$

addition and multiplication are associative,

$$(A + B) + C = A + (B + C), \tag{7.3}$$

$$(AB)C = A(BC); \tag{7.4}$$

multiplication is distributive with respect to addition,

$$A(B + C) = AB + AC, \tag{7.5}$$

$$(B + C)A = BA + CA. \tag{7.6}$$

There exist unique multivectors 0 and 1 such that

$$A + 0 = A, \tag{7.7}$$

$$1A = A. \tag{7.8}$$

Every multivector A has a unique additive inverse $-A$, that is,

$$A + (-A) = 0. \tag{7.9}$$


Of course, the whole algebra is assumed to be algebraically closed, that is, the 
sum or product of any two multivectors is itself a unique multivector. 

It is hardly necessary to discuss the significance of the above axioms, since 
they are familiar from the elementary algebra of scalars. They can be used to 
manipulate multivectors in exactly the same way that numbers are manipu- 
lated in arithmetic. For example, axiom (7.9) is used to define subtraction of 
arbitrary multivectors in the same way that subtraction of vectors was defined 
by Equation (3.10). 

To complete our system of axioms for geometric algebra, we need some 
axioms that characterize the various kinds of k-vectors. First of all, we assume 
that the set of all scalars in the algebra can be identified with the real 
numbers, and we express the commutativity of scalar multiplication by the
axiom


$$\lambda A = A\lambda \tag{7.10}$$


for every scalar $\lambda$ and multivector A. Vectors are characterized by the
following axiom. The “square” of any nonzero vector a is a unique positive
scalar $|\mathbf{a}|^2$, that is,

$$\mathbf{a}^2 = |\mathbf{a}|^2 > 0. \tag{7.11}$$


We characterize k-vectors of higher grade by relating them to vectors. It will 
be most convenient to do that after we introduce a couple of definitions. 
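For a vector a and any k-blade $\mathbf{A}_k$, we define the inner product by


$$\mathbf{a} \cdot \mathbf{A}_k = \tfrac{1}{2}\left(\mathbf{a}\mathbf{A}_k - (-1)^k \mathbf{A}_k\mathbf{a}\right), \tag{7.12a}$$

and the outer product by

$$\mathbf{a} \wedge \mathbf{A}_k = \tfrac{1}{2}\left(\mathbf{a}\mathbf{A}_k + (-1)^k \mathbf{A}_k\mathbf{a}\right). \tag{7.12b}$$


Adding these equations, we get

$$\mathbf{a}\mathbf{A}_k = \mathbf{a} \cdot \mathbf{A}_k + \mathbf{a} \wedge \mathbf{A}_k. \tag{7.12c}$$


Note that (7.12a) includes (6.9) and (6.12) as special cases, while (7.12b)
includes (6.10) and (6.13), and (7.12c) includes (6.1) and (6.14).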

Using the definitions (7.12a) and (7.12b), we adopt the following prop- 
ositions as axioms: 


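$$\mathbf{a} \cdot \mathbf{A}_k \text{ is a } (k-1)\text{-vector}, \tag{7.13a}$$
$$\mathbf{a} \wedge \mathbf{A}_k \text{ is a } (k+1)\text{-vector}. \tag{7.13b}$$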


Thus, the inner product lowers the grade of a k-vector, while the outer
product raises the grade. According to (7.1), however, all k-vectors with
grade $k \ge 4$ must vanish. To assure this, we need one more axiom: For every
vector a and 3-vector $\mathbf{A}_3$,


$$\mathbf{a} \wedge \mathbf{A}_3 = 0. \tag{7.14a}$$

By virtue of the definition (7.12b), this can alternatively be written

$$\mathbf{a}\mathbf{A}_3 = \mathbf{A}_3\mathbf{a}. \tag{7.14b}$$


In other words, vectors always commute with trivectors. 
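
The definitions (7.12a) and (7.12b) and the grade statements (7.13) and (7.14) can be exercised on concrete blades. The Python sketch below is an illustration only, built on assumptions not made in the text: it supposes three mutually orthogonal unit reference vectors that square to one and anticommute with one another, and it stores multivectors as dictionaries from a bitmask of reference directions to a coefficient, so that the grade of a term is the number of bits set in its key. All function names are hypothetical.

```python
from fractions import Fraction

def reorder_sign(x, y):
    """Sign from sorting the basis vectors of blade x followed by those of blade y."""
    x >>= 1
    swaps = 0
    while x:
        swaps += bin(x & y).count("1")
        x >>= 1
    return -1 if swaps % 2 else 1

def gp(p, q):
    """Geometric product; a repeated basis vector squares to +1 and drops out."""
    out = {}
    for bp, cp in p.items():
        for bq, cq in q.items():
            key = bp ^ bq
            out[key] = out.get(key, 0) + reorder_sign(bp, bq) * cp * cq
    return {k: v for k, v in out.items() if v != 0}

def combine(p, q, sign=1):
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) + sign * v
    return {k: v for k, v in out.items() if v != 0}

def scale(s, p):
    return {k: s * v for k, v in p.items() if s * v != 0}

def vector(*coords):
    return {1 << i: x for i, x in enumerate(coords) if x != 0}

def grades(p):
    """Set of grades present in a stored multivector (empty for zero)."""
    return {bin(k).count("1") for k in p}

HALF = Fraction(1, 2)

def inner(a, A, k):
    """Definition (7.12a) for a vector a and a k-blade A."""
    return scale(HALF, combine(gp(a, A), gp(A, a), -((-1) ** k)))

def outer(a, A, k):
    """Definition (7.12b) for a vector a and a k-blade A."""
    return scale(HALF, combine(gp(a, A), gp(A, a), (-1) ** k))

a, b, c, d = vector(1, 2, 3), vector(4, -1, 2), vector(0, 5, -2), vector(1, 0, 1)
A1 = b
A2 = outer(b, c, 1)        # a 2-blade
A3 = outer(d, A2, 2)       # a 3-blade

report = {k: (grades(inner(a, A, k)), grades(outer(a, A, k)))
          for k, A in ((1, A1), (2, A2), (3, A3))}
commutes_with_trivector = gp(a, A3) == gp(A3, a)     # (7.14b)

print(report)
print(commutes_with_trivector)
```

For k = 3 the outer product comes out as the empty dictionary, which is how this representation records the vanishing required by (7.14a).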

Finally, to assure that the whole algebraic system is not vacuous, we must 
assume that nonzero multivectors with all grades $k \le 3$ actually exist.

This completes our formulation of the axioms for geometric algebra. We 
have neglected some logical fine points (e.g., Exercise 7.1), but our axioms 
suffice to show exactly how geometric algebra generalizes the familiar algebra 
of scalars. We have chosen a notation for geometric algebra that is as similar 
as possible to the notation of scalar algebra. This is a point of great impor- 
tance, for it facilitates the transfer of skills in manipulations with scalar 
algebra to manipulations with geometric algebra. Let us note exactly how the 
basic operations of scalar algebra transfer to geometric algebra. 

Axioms (7.2) to (7.9) implicitly define the operations of addition, subtrac- 
tion and multiplication. Except for the absence of a general commutative law 
for multiplication, they are identical to the axioms of scalar algebra. There- 
fore, multivectors can be equated, added, subtracted and multiplied in 
exactly the same way as scalar quantities, provided one does not reorder
multiplicative factors which do not commute. Division by multivectors can be
defined in terms of multiplication, just as in scalar algebra. But we need to
pay special attention to notation on account of noncommutativity, so let us
consider the matter explicitly. 

In geometric algebra, as in elementary algebra, the solution of equations is 
greatly facilitated by the possibility of division. We can divide by a multivector
A if it has a multiplicative inverse. The inverse of A, if it exists, is denoted
by $A^{-1}$ or $1/A$ and defined by the equation


$$A^{-1}A = 1. \tag{7.15}$$


We can divide any multivector B by A in two ways, by multiplying it by $A^{-1}$ on
the left,


$$A^{-1}B = \frac{1}{A}B,$$


or on the right,


$$BA^{-1} = B\frac{1}{A}.$$


Obviously, the “left division” is not equivalent to the “right division” unless
B commutes with $A^{-1}$, in which case the division can be denoted unambiguously
by $B/A$.

Every nonzero vector a has a multiplicative inverse. To determine the
inverse of a, we multiply the equation $\mathbf{a}^{-1}\mathbf{a} = 1$ on the right by a and divide by
the scalar $\mathbf{a}^2 = |\mathbf{a}|^2$; thus


$$\mathbf{a}^{-1} = \frac{1}{\mathbf{a}} = \frac{\mathbf{a}}{\mathbf{a}^2} = \frac{\mathbf{a}}{|\mathbf{a}|^2}. \tag{7.16}$$


With due regard for the order of factors, many tricks of elementary algebra, 
such as “rationalizing the denominator”, are equally useful in geometric
algebra. It should be noted, however, that some multivectors do not have 
multiplicative inverses (see Exercise 7.2), so it is impossible to divide by 
them. 
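
The inverse of a vector and the distinction between left and right division can be seen in a small computation. The Python sketch below is an illustration, not a definitive implementation. It assumes three mutually orthogonal unit reference vectors that square to one and anticommute, stores multivectors as dictionaries from a bitmask of reference directions to a coefficient, and uses integer components so that exact fractions can be formed. The names `gp`, `scale`, `vector` and `inverse` are hypothetical.

```python
from fractions import Fraction

def reorder_sign(x, y):
    """Sign from sorting the basis vectors of blade x followed by those of blade y."""
    x >>= 1
    swaps = 0
    while x:
        swaps += bin(x & y).count("1")
        x >>= 1
    return -1 if swaps % 2 else 1

def gp(p, q):
    """Geometric product; a repeated basis vector squares to +1 and drops out."""
    out = {}
    for bp, cp in p.items():
        for bq, cq in q.items():
            key = bp ^ bq
            out[key] = out.get(key, 0) + reorder_sign(bp, bq) * cp * cq
    return {k: v for k, v in out.items() if v != 0}

def scale(s, p):
    return {k: s * v for k, v in p.items() if s * v != 0}

def vector(*coords):
    return {1 << i: x for i, x in enumerate(coords) if x != 0}

def inverse(a):
    """Inverse of a nonzero vector with integer components: a divided by the scalar a^2."""
    a_squared = gp(a, a)[0]          # a positive scalar, in accordance with (7.11)
    return scale(Fraction(1, a_squared), a)

a, b = vector(1, 2, 3), vector(4, -1, 2)
a_inv = inverse(a)

left = gp(a_inv, b)                  # "left division" of b by a
right = gp(b, a_inv)                 # "right division" of b by a

# For a vector collinear with a the two divisions agree.
left_collinear = gp(a_inv, scale(2, a))
right_collinear = gp(scale(2, a), a_inv)

print(gp(a_inv, a), gp(a, a_inv))
print(left, right)
print(left_collinear, right_collinear)
```

In this example the two quotients share the same scalar part and differ only in the sign of the bivector part, in line with (6.1) and (6.2).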


1-7. Exercises


Hints and solutions for selected exercises are given at the back of the book.

(7.1) The axioms given in this chapter do not suffice to prove the elementary
(a) Addition Property of Equality:
If B = C, then A + B = A + C,
(b) Multiplication Property of Equality:
If B = C, then AB = AC.

Add these properties to our list of axioms and prove the converses
(c) Cancellation Principle of Addition:
If A + B = A + C, then B = C.
(d) Cancellation Principle of Multiplication:
If AB = AC and $A^{-1}$ exists, then B = C.

Specify the justification for each step in the proof. The proofs in
geometric algebra are identical to those in elementary algebra.

(7.2) Let $A = \alpha + \mathbf{a}$ where $\alpha$ is a scalar and a is a nonzero vector.

(a) Find $A^{-1}$ as a function of $\alpha$ and a. What conditions on $\alpha$ and a
imply that $A^{-1}$ does not exist?

(b) Show that if $A^{-1}$ does not exist, then A can be normalized so
that

$$A^2 = A.$$

A quantity with this property is said to be idempotent.

(c) Show that if $A \neq 1$ is idempotent, its product with any other
multivector is not invertible. It can be proved that every multivector
which does not have an inverse has an idempotent for a
factor.

(d) Find an idempotent which does have an inverse.

(7.3) Prove that every left inverse is also a right inverse and that this
inverse is unique.


Chapter 2 


Developments in Geometric Algebra 


In Chapter 1 we developed geometric algebra as a symbolic system for 
representing the basic geometrical concepts of direction, magnitude, orien- 
tation and dimension. In this chapter we continue the development of 
geometric algebra into a full-blown mathematical language. The basic gram- 
mar of this language is completely specified by the axioms set down at the end 
of Chapter 1. But there is much more to a language than its grammar! 

To develop geometric algebra to the point where we can express and 
explore the ideas of mechanics with fluency, in this chapter we introduce 
auxiliary concepts and definitions, derive useful algebraic relations, describe 
simple curves and surfaces with algebraic equations, and formulate the 
fundamentals of differentiation and integration with respect to scalar 
variables. Further mathematical developments are given in Chapter 5. 


2-1. Basic Identities and Definitions 


In Chapter 1 we were led to the geometric product for vectors by combining 
inner and outer products according to the equation 

$$ab = a \cdot b + a \wedge b. \tag{1.1}$$

Then we reversed the procedure, defining the inner and outer products in 
terms of the geometric product by the equations 

$$a \cdot b = \tfrac{1}{2}(ab + ba), \tag{1.2}$$
$$a \wedge b = \tfrac{1}{2}(ab - ba). \tag{1.3}$$


This did more than reduce two different kinds of multiplication to one. It 
made possible the formulation of a simple axiom system from which an 
unlimited number of geometrical relations can be deduced by algebraic 
manipulation. In this section we aim to improve our skills at carrying out such 
deductions and establish some widely useful results. 

The inner and outer products appear frequently in applications, because 
they have straightforward geometrical interpretations, as we saw in Chapter 
1. For this reason, it is often desirable to operate directly with inner and outer 
products, even though we regard the geometric product as more fundamental. 
To make this possible, we need a system of algebraic identities relating 
inner and outer products. We derive these identities, of course, by using the 
geometric product and the axioms of geometric algebra set down in Section 
1-7. 

The different products are most easily related by the equation 

$$aA_r = a \cdot A_r + a \wedge A_r, \tag{1.4}$$


which generalizes (1.1) to apply to any r-blade $A_r$, that is, any r-vector with 
grade r > 0. Recall that the corresponding definitions of inner and outer 
products are given by 

$$a \cdot A_r = \tfrac{1}{2}\bigl(aA_r - (-1)^r A_r a\bigr) = (-1)^{r+1} A_r \cdot a, \tag{1.5}$$
$$a \wedge A_r = \tfrac{1}{2}\bigl(aA_r + (-1)^r A_r a\bigr) = (-1)^r A_r \wedge a. \tag{1.6}$$

We will make frequent use of the fact that $a \cdot A_r$ is an (r − 1)-vector (a 
scalar if r = 1), while $a \wedge A_r$ is an (r + 1)-vector. 

To illustrate the use of (1.4) and its special case (1.1), let us derive the 
associative rule for the outer product. Beginning with the associative rule 
for the geometric product, 

$$a(bc) = (ab)c,$$

we use (1.1) to get 

$$a(b \cdot c + b \wedge c) = (a \cdot b + a \wedge b)c.$$

Applying the distributive rule and (1.4), we get 

$$a(b \cdot c) + a \cdot (b \wedge c) + a \wedge (b \wedge c) = (a \cdot b)c + (a \wedge b) \cdot c + (a \wedge b) \wedge c.$$

Now we identify the terms $a(b \cdot c)$ and $a \cdot (b \wedge c)$ as vectors, and 
the term $a \wedge (b \wedge c)$ as a trivector. Since vectors are distinct from 
trivectors, we can separately equate vector and trivector parts on each side 
of the equation. By equating trivector parts, we get the associative rule 

$$a \wedge (b \wedge c) = (a \wedge b) \wedge c. \tag{1.7}$$

And by equating the vector parts we find an algebraic identity which we have 
not seen before, 

$$a(b \cdot c) + a \cdot (b \wedge c) = (a \cdot b)c + (a \wedge b) \cdot c. \tag{1.8}$$

For more about this identity, see Exercise 1. 

This derivation of the associative rule (1.7) should be compared with our 
previous derivation of the same rule in Section 1-6. That derivation was 
considerably more complicated, because it employed a direct reduction of the 
outer product to the geometric product. Moreover, the indirect method 
employed here gives us the additional "vector identity" (1.8) at no extra cost. 
Note the general structure of the method: an identity involving geometric 
products alone is expanded into inner and outer products by using (1.4) and 
then parts of the same grade are separately equated. This will be our principal 
method for establishing identities involving inner and outer products. As 
another example, note that the method immediately gives the distributive 
rules for inner and outer products. Thus, if a is a vector and $B_r$ and $C_r$ are 
r-blades, then by applying (1.4) to 

$$a(B_r + C_r) = aB_r + aC_r,$$

and separating parts of different grade, we get 

$$a \cdot (B_r + C_r) = a \cdot B_r + a \cdot C_r, \tag{1.9a}$$

and 

$$a \wedge (B_r + C_r) = a \wedge B_r + a \wedge C_r. \tag{1.9b}$$

Here we have the distributive rules in a somewhat more general (hence more 
useful) form than they were presented in Chapter 1. 
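
The grade-separation method can be tried out on concrete vectors. The sketch below is an illustration only, under stated assumptions: an orthonormal Euclidean basis, multivectors stored as dictionaries from basis-blade bitmasks to coefficients, and hypothetical helper names (`mul`, `gp`, `inner`, `outer`, `grade`, `add`, `vec`). The inner and outer products are obtained by keeping the lowest and highest grades of each basis-blade product, and they are applied here only to blades of grade 1 or more.

```python
from itertools import product

def bits(k):
    """Grade of a basis blade: the number of set bits in its bitmask."""
    return bin(k).count("1")

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bits(a & b)
        a >>= 1
    return -1 if swaps % 2 else 1

def mul(A, B, keep=lambda r, s, g: True):
    """Product of multivectors {bitmask: coeff}; keep() selects output grades."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis, so e_i e_i = 1
        if keep(bits(ka), bits(kb), bits(k)):
            out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def gp(A, B):
    return mul(A, B)

def inner(A, B):
    return mul(A, B, lambda r, s, g: g == abs(r - s))

def outer(A, B):
    return mul(A, B, lambda r, s, g: g == r + s)

def add(A, B, sign=1):
    out = dict(A)
    for k, c in B.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def grade(A, r):
    return {k: c for k, c in A.items() if bits(k) == r}

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

a, b, c = vec(1, 2, 0), vec(0, 1, 3), vec(2, 0, 1)
abc = gp(gp(a, b), c)

# Trivector parts: both sides of the associative rule (1.7).
tri_left = outer(a, outer(b, c))
tri_right = outer(outer(a, b), c)

# Vector parts: both sides of the identity (1.8).
vec_left = add(gp(a, inner(b, c)), inner(a, outer(b, c)))
vec_right = add(gp(inner(a, b), c), inner(outer(a, b), c))

print(tri_left, tri_right, grade(abc, 3))
print(vec_left, vec_right, grade(abc, 1))
```

Each side of (1.7) coincides with the trivector part of abc, and each side of (1.8) with its vector part, which is exactly what equating parts of the same grade asserts.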

These examples show the importance of separating a multivector or a 
multivector equation into parts of different grade. So it will be useful to 
introduce a special notation to express such a separation. Accordingly, we 
write $\langle A \rangle_r$ to denote the r-vector part of a multivector A. For 
example, if A = abc, this notation enables us to write 

$$\langle abc \rangle_3 = a \wedge b \wedge c$$

for the trivector part, 

$$\langle abc \rangle_1 = a(b \cdot c) + a \cdot (b \wedge c)$$

for the vector part, while the vanishing of scalar and bivector parts is 
described by the equation 

$$\langle abc \rangle_0 = 0 = \langle abc \rangle_2.$$

According to axiom (7.1) of Chapter 1, every multivector can be decomposed 
into a sum of its r-vector parts, as expressed by writing 

$$A = \sum_r \langle A \rangle_r = \langle A \rangle_0 + \langle A \rangle_1 + \langle A \rangle_2 + \langle A \rangle_3. \tag{1.10}$$

If $A = \langle A \rangle_r$, then A is said to be homogeneous of grade r, that is, A is 
an r-vector. A multivector A is said to be even (odd) if $\langle A \rangle_r = 0$ when 
r is an odd (even) integer. Obviously every multivector A can be expressed as 
a sum of an even part $\langle A \rangle_+$ and an odd part $\langle A \rangle_-$. Thus 

$$A = \langle A \rangle_+ + \langle A \rangle_-, \tag{1.11a}$$

where 

$$\langle A \rangle_+ = \langle A \rangle_0 + \langle A \rangle_2, \tag{1.11b}$$
$$\langle A \rangle_- = \langle A \rangle_1 + \langle A \rangle_3. \tag{1.11c}$$




We shall see later that the distinction between even and odd multivectors is 
important, because the even multivectors form an algebra by themselves but 
the odd multivectors do not. 
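
A short numerical sketch may make the grade decomposition and the even/odd split more concrete. It is an illustration under assumptions not found in the text: an orthonormal Euclidean basis, multivectors stored as dictionaries from basis-blade bitmasks to coefficients, and hypothetical helper names (`gp`, `grade`, `even`, `odd`, `add`, `vec`).

```python
from itertools import product

def bits(k):
    """Grade of a basis blade: the number of set bits in its bitmask."""
    return bin(k).count("1")

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bits(a & b)
        a >>= 1
    return -1 if swaps % 2 else 1

def gp(A, B):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis
        out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def add(A, B):
    out = dict(A)
    for k, c in B.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def grade(A, r):
    """The r-vector part <A>_r."""
    return {k: c for k, c in A.items() if bits(k) == r}

def even(A):
    """Even part <A>_+ : grades 0, 2, ..."""
    return {k: c for k, c in A.items() if bits(k) % 2 == 0}

def odd(A):
    """Odd part <A>_- : grades 1, 3, ..."""
    return {k: c for k, c in A.items() if bits(k) % 2 == 1}

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

a, b, c = vec(1, 2, 0), vec(0, 1, 3), vec(2, 0, 1)
ab = gp(a, b)            # scalar plus bivector: an even multivector
abc = gp(ab, c)          # vector plus trivector: an odd multivector
A = add(ab, abc)         # a general multivector with all four grades

print([grade(A, r) for r in range(4)])
print(even(A) == ab, odd(A) == abc)
print(odd(gp(ab, gp(b, c))))   # even times even has no odd part
print(odd(gp(abc, a)))         # odd times odd is even, not odd
```

The last two lines illustrate why the even multivectors are closed under multiplication while the odd ones are not.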

According to (1.10), we have $\langle A \rangle_k = 0$ for all k > 3, that is, every blade 
with grade k > 3 must vanish. We adopted this condition in Chapter 1 to 
express the fact that physical space is three dimensional, so we will be 
assuming it in our treatment of mechanics throughout this book. However, 
such a condition is not essential for mathematical reasons, and there are other 
applications of geometric algebra to physics where it is not appropriate. For 
the sake of mathematical generality, therefore, all results and definitions in 
this section are formulated without limitations on grade, with the exception, 
of course, of (1.10) and (1.11b, c). This generality is achieved at very little 
extra cost, and it has the advantage of revealing precisely what features of 
geometric algebra are peculiar to three dimensions. 

Before continuing, it will be worthwhile to discuss the use of parentheses in 
algebraic expressions. Note that the expression $a \cdot bc$ is ambiguous. It could 
mean $(a \cdot b)c$, which is to say that the inner product $a \cdot b$ is performed 
first and the resulting scalar multiplies the vector c. On the other hand, it 
could mean $a \cdot (bc)$, which is to say that the geometric product bc is 
performed before the inner product. The two interpretations give completely 
different algebraic results. To remove such ambiguities without using 
parentheses, we introduce the following preference convention: If there is 
ambiguity, indicated inner and outer products should be performed before an 
adjacent geometric product. Thus 

$$(A \wedge B)C = A \wedge BC \neq A \wedge (BC), \tag{1.12a}$$
$$(A \cdot B)C = A \cdot BC \neq A \cdot (BC). \tag{1.12b}$$

This convention eliminates an appreciable number of parentheses, especially 
in complicated expressions. Other parentheses can be eliminated by the 
convention that outer products have "preference" over inner products, so 

$$A \cdot (B \wedge C) = A \cdot B \wedge C \neq (A \cdot B) \wedge C, \tag{1.13}$$

but we use this convention much less often than the preceding one. 

The most useful identity relating inner and outer products is, of course, the 
simplest one: 

$$a \cdot (b \wedge c) = a \cdot b\,c - a \cdot c\,b. \tag{1.14}$$


We derived this in Section 1-6 before we had established our axiom system for 
geometric algebra. Now we can derive it by a simpler method. First we use 
(1.2) in the form $ab = -ba + 2a \cdot b$ to reorder multiplicative factors as 
follows: 

$$abc = -bac + 2a \cdot b\,c = -b(-ca + 2a \cdot c) + 2a \cdot b\,c.$$

Rearranging terms and using (1.1), we obtain 

$$a \cdot b\,c - a \cdot c\,b = \tfrac{1}{2}(abc - bca) = \tfrac{1}{2}\bigl(a(b \wedge c) - (b \wedge c)a\bigr),$$

which, by (1.5), gives us (1.14) as desired. 

By the same method we can derive the more general reduction formula 

$$a \cdot (b \wedge C_s) = a \cdot b\,C_s - b \wedge (a \cdot C_s), \tag{1.15}$$

where a and b are vectors while $C_s$ is an s-blade. 
We use (1.2) and (1.5) to reorder multiplicative factors as follows: 

$$abC_s = -baC_s + 2a \cdot b\,C_s = -b\bigl((-1)^s C_s a + 2a \cdot C_s\bigr) + 2a \cdot b\,C_s.$$

Rearranging terms and using (1.4) and (1.5), we obtain 

$$a \cdot b\,C_s - b(a \cdot C_s) = \tfrac{1}{2}\bigl[a(bC_s) + (-1)^s (bC_s)a\bigr] = a \cdot (b \cdot C_s) + a \cdot (b \wedge C_s).$$

The s-vector part of this equation gives us (1.15) as desired. 
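By iterating (1.15), we obtain the expanded reduction formula 

$$a \cdot (a_1 \wedge a_2 \wedge \cdots \wedge a_r) = \sum_{k=1}^{r} (-1)^{k+1}\, a \cdot a_k\; a_1 \wedge \cdots \wedge \check{a}_k \wedge \cdots \wedge a_r$$
$$= a \cdot a_1\, a_2 \wedge a_3 \wedge \cdots \wedge a_r - a \cdot a_2\, a_1 \wedge a_3 \wedge \cdots \wedge a_r + \cdots + (-1)^{r+1}\, a \cdot a_r\, a_1 \wedge a_2 \wedge \cdots \wedge a_{r-1}. \tag{1.16}$$

The inverted circumflex in the product $a_1 \wedge \cdots \wedge \check{a}_k \wedge \cdots \wedge a_r$ 
means that the kth factor $a_k$ is to be omitted. Equation (1.16) determines the 
inner product of the r-vector $A_r = a_1 \wedge \cdots \wedge a_r$ with a in terms of its 
vector factors $a_k$ and their inner products $a \cdot a_k$. It would be quite 
appropriate to refer to (1.15) as the Laplace expansion of the inner product, 
because of its relation to the expansion of a determinant (see Chapter 5). 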
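
The reduction formulas lend themselves to a direct numerical check. The following sketch is only an illustration and rests on assumptions beyond the text: an orthonormal Euclidean basis (four dimensions are used so that no grade restriction interferes), multivectors stored as dictionaries from basis-blade bitmasks to coefficients, and hypothetical helper names (`mul`, `inner`, `outer`, `scale`, `dot`, `expanded_reduction`).

```python
from itertools import product

def bits(k):
    return bin(k).count("1")

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bits(a & b)
        a >>= 1
    return -1 if swaps % 2 else 1

def mul(A, B, keep):
    """Product of multivectors {bitmask: coeff}; keep() selects output grades."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis
        if keep(bits(ka), bits(kb), bits(k)):
            out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def inner(A, B):
    return mul(A, B, lambda r, s, g: g == abs(r - s))

def outer(A, B):
    return mul(A, B, lambda r, s, g: g == r + s)

def add(A, B):
    out = dict(A)
    for k, c in B.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def scale(s, A):
    return {k: s * c for k, c in A.items() if s * c != 0}

def dot(a, b):
    """Scalar value of the inner product of two vectors."""
    return inner(a, b).get(0, 0)

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

def expanded_reduction(a, factors):
    """Right side of (1.16): alternating sum with the kth factor omitted."""
    total = {}
    for k, ak in enumerate(factors):
        blade = {0: 1}                       # start from the scalar 1
        for v in factors[:k] + factors[k + 1:]:
            blade = outer(blade, v)
        sign = (-1) ** k                     # 0-based k: same as (-1)^(k+1) for 1-based k
        total = add(total, scale(sign * dot(a, ak), blade))
    return total

a = vec(1, -1, 2, 0)
a1, a2, a3 = vec(1, 0, 0, 1), vec(0, 2, 1, 0), vec(3, 0, 1, 1)

lhs_14 = inner(a, outer(a1, a2))                 # left side of (1.14)
rhs_14 = expanded_reduction(a, [a1, a2])
lhs_16 = inner(a, outer(outer(a1, a2), a3))      # left side of (1.16), r = 3
rhs_16 = expanded_reduction(a, [a1, a2, a3])

print(lhs_14, rhs_14)
print(lhs_16, rhs_16)
```

For r = 2 the function `expanded_reduction` reduces to the two-term identity (1.14), so the same routine checks both formulas.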

Our definitions (1.5) and (1.6) for inner and outer products require that 
one of the factors in the products be a vector. It will be useful to generalize 
these definitions to apply to blades of any grade. For any r-blade $A_r$ and 
s-blade $B_s$, the inner product $A_r \cdot B_s$ is defined by 

$$A_r \cdot B_s \equiv \langle A_r B_s \rangle_{|r-s|}, \tag{1.16}$$

and the outer product $A_r \wedge B_s$ is defined by 

$$A_r \wedge B_s \equiv \langle A_r B_s \rangle_{r+s}. \tag{1.17}$$

Thus, the inner product produces an |r − s|-vector, while the outer product 
produces an (r + s)-vector. The symbol $\equiv$ denotes a definition or identity. 

The reduction of an inner product between two blades can be accomplished 
by using the formula 

$$(A_r \wedge b) \cdot C_s = A_r \cdot (b \cdot C_s), \tag{1.18}$$

which holds for 0 < r < s, with $A_r = \langle A_r \rangle_r$, $b = \langle b \rangle_1$ and 
$C_s = \langle C_s \rangle_s$. Note that the factor $b \cdot C_s$ on the right side of (1.18) 
can be further reduced by (1.15) or (1.16) if $C_s$ is expressed as a product of 
vectors. Equation (1.18) can be proved in the same way as (1.8). Use (1.4) to 
expand $(A_r b)C_s = A_r(bC_s)$ and ascertain that the (s − r − 1)-vector part is 
equivalent to (1.18). 

The student should beware that the geometric product of blades A and B is 
not generally related to inner and outer products by the formula 

$$AB = A \cdot B + A \wedge B$$

unless one of the factors is a vector, as in (1.4). In particular, this formula 
does not hold if both A and B are bivectors. To prove that, express A as a 
product of orthogonal vectors by writing $A = a \wedge b = ab$. 
Then 

$$AB = a(b \cdot B + b \wedge B) = a \cdot (b \cdot B) + a \wedge (b \cdot B) + a \cdot (b \wedge B) + a \wedge b \wedge B.$$

Hence 

$$AB = A \cdot B + \langle AB \rangle_2 + A \wedge B, \tag{1.19}$$

where 

$$A \cdot B = \langle AB \rangle_0 = a \cdot (b \cdot B),$$
$$\langle AB \rangle_2 = a \wedge (b \cdot B) + a \cdot (b \wedge B),$$
$$A \wedge B = \langle AB \rangle_4 = a \wedge b \wedge B.$$

Note that we have three terms in (1.19) in contrast to the two terms in (1.4). 
To learn more about the product between bivectors, we use the trick that any 
geometric product can be decomposed into symmetric and antisymmetric 
parts by writing 

$$AB = \tfrac{1}{2}(AB + BA) + \tfrac{1}{2}(AB - BA).$$

Comparing this with (1.19), it is not difficult to establish that, for bivectors A 
and B, 

$$A \cdot B + A \wedge B = \tfrac{1}{2}(AB + BA) = B \cdot A + B \wedge A, \tag{1.20a}$$
$$\langle AB \rangle_2 = \tfrac{1}{2}(AB - BA) = -\langle BA \rangle_2. \tag{1.20b}$$

The expression $\tfrac{1}{2}(AB - BA)$ is sometimes called the commutator or 
commutator product of A and B, because it vanishes if A and B commute. 
Equation (1.20b) tells us that the commutator product of bivectors produces 
another bivector. Equation (1.20a) tells us that the symmetric product of 
bivectors $\tfrac{1}{2}(AB + BA)$ produces a scalar $A \cdot B$ and a 4-vector 
$A \wedge B$. Of course, we can take $A \wedge B = 0$ when we employ our grade 
restriction axiom as in (1.10). 

The procedure which gave us the expansion (1.19) for the product of 
bivectors can be applied to the product of blades of any grade. Note that for 
the product $A_r B_s = a_1 a_2 \cdots a_r B_s$, if r < s, we reduce the initial grade of 
the factor $B_s$ on the right by successive inner products with vectors, and the 
term of lowest grade will be 

$$a_1 \cdot (a_2 \cdot ( \cdots (a_r \cdot B_s))) = \langle A_r B_s \rangle_{s-r} = A_r \cdot B_s.$$

Thus, we conclude that $A_r \cdot B_s$ is the term of lowest grade in the product 
$A_r B_s$. And as a useful corollary, we note that the product $A_r B_s$ can have a 
nonzero scalar part $A_r \cdot B_s$ only if r = s. 
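
The three-term expansion (1.19) and the symmetric and antisymmetric parts (1.20a, b) can be observed numerically. The sketch below is an illustration under stated assumptions: it works in four Euclidean dimensions with an orthonormal basis so that the 4-vector part does not vanish, it stores multivectors as dictionaries from basis-blade bitmasks to coefficients, and the helper names are hypothetical.

```python
from itertools import product

def bits(k):
    return bin(k).count("1")

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bits(a & b)
        a >>= 1
    return -1 if swaps % 2 else 1

def mul(A, B, keep=lambda r, s, g: True):
    """Product of multivectors {bitmask: coeff}; keep() selects output grades."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis
        if keep(bits(ka), bits(kb), bits(k)):
            out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def gp(A, B):
    return mul(A, B)

def inner(A, B):
    return mul(A, B, lambda r, s, g: g == abs(r - s))

def outer(A, B):
    return mul(A, B, lambda r, s, g: g == r + s)

def add(A, B, sign=1):
    out = dict(A)
    for k, c in B.items():
        out[k] = out.get(k, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def scale(s, A):
    return {k: s * c for k, c in A.items() if s * c != 0}

def grade(A, r):
    return {k: c for k, c in A.items() if bits(k) == r}

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

A = outer(vec(1, 1, 0, 0), vec(0, 1, 2, 0))   # a 2-blade
B = outer(vec(1, 0, 1, 0), vec(0, 1, 0, 1))   # another 2-blade
AB, BA = gp(A, B), gp(B, A)

scalar_part = inner(A, B)      # A . B  = <AB>_0
bivector_part = grade(AB, 2)   # <AB>_2
four_vector = outer(A, B)      # A ^ B  = <AB>_4

print(scalar_part, bivector_part, four_vector)
print(add(AB, BA))       # twice the symmetric part
print(add(AB, BA, -1))   # twice the commutator product
```

Halves are avoided by comparing doubled quantities, so all coefficients stay integers.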


Factorization 


We know the great importance of factoring in scalar algebra. In geometric 
algebra we have a new kind of factoring which is equally important, namely, 
the factoring of a k-blade into a product of vectors. Consider, for example, a 
unit vector a and a nonzero bivector B such that 

$$a \wedge B = 0. \tag{1.21}$$

By virtue of (1.4), this implies that 

$$aB = a \cdot B = b, \tag{1.22}$$

which defines a vector b. We can solve this equation for B by division, that is, 
by multiplying it by $a^{-1} = a$. Thus, we obtain 

$$B = ab = a \wedge b. \tag{1.23}$$

The last equality follows from (1.1), which tells us that $a \cdot b = 0$. Equation 
(1.23) is a factorization of the bivector B into a product of orthogonal vectors, 
so (1.21) is a condition that a be a factor of B. Of course, b is also a factor of B 
and $b \wedge B = 0$. Equation (1.22) shows that b is a unique factor of B 
orthogonal to a. In Section 2-2 it will become obvious that B can be factored 
into orthogonal vector pairs in an infinite number of ways. Blades of higher 
grade can be factored in a similar way. 
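
As a worked instance of (1.21) to (1.23), the sketch below factors a bivector using a unit vector that lies in its plane. It is an illustration only: the orthonormal Euclidean basis, the dictionary representation of multivectors, the exact rational arithmetic, and the helper names are all assumptions introduced here.

```python
from fractions import Fraction
from itertools import product

def bits(k):
    return bin(k).count("1")

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bits(a & b)
        a >>= 1
    return -1 if swaps % 2 else 1

def mul(A, B, keep=lambda r, s, g: True):
    """Product of multivectors {bitmask: coeff}; keep() selects output grades."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis
        if keep(bits(ka), bits(kb), bits(k)):
            out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def gp(A, B):
    return mul(A, B)

def inner(A, B):
    return mul(A, B, lambda r, s, g: g == abs(r - s))

def outer(A, B):
    return mul(A, B, lambda r, s, g: g == r + s)

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

a = vec(Fraction(3, 5), Fraction(4, 5), 0)   # unit vector, so a^{-1} = a
B = outer(a, vec(1, 0, 2))                   # a bivector that contains a

print(outer(a, B))        # condition (1.21): a ^ B = 0
b = gp(a, B)              # equation (1.22): b = aB = a . B
print(b)
print(gp(a, b) == B)      # equation (1.23): B = ab
print(inner(a, b))        # b is orthogonal to a
```

The factor b differs from the vector originally used to build B; it is that vector with its component along a removed, which is why it is orthogonal to a.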


Reversion 


In algebraic computations, it is often desirable to reorder the factors in a 
product. For this reason, it is convenient to introduce the operation of 
reversion defined by the equations 

$$(AB)^\dagger = B^\dagger A^\dagger, \tag{1.24a}$$
$$(A + B)^\dagger = A^\dagger + B^\dagger, \tag{1.24b}$$
$$\langle A^\dagger \rangle_0 = \langle A \rangle_0, \tag{1.24c}$$
$$a^\dagger = a \quad \text{where } a = \langle a \rangle_1. \tag{1.24d}$$

We say that $A^\dagger$ is the reverse of the multivector A. It follows easily from 
(1.24a) and (1.24d) that the reverse of a product of vectors is 

$$(a_1 a_2 \cdots a_r)^\dagger = a_r \cdots a_2 a_1. \tag{1.25}$$

This justifies our choice of the name "reverse". 
The reverse of a bivector $B = a \wedge b$ is given by 

$$B^\dagger = (a \wedge b)^\dagger = b \wedge a = -a \wedge b = -B, \tag{1.26}$$

where the anticommutativity of the outer product was used to reorder vector 
factors. In a similar way we can reorder vector factors in a trivector one at a 
time: 

$$c \wedge b \wedge a = -b \wedge c \wedge a = b \wedge a \wedge c = -a \wedge b \wedge c.$$

Hence, 

$$(a \wedge b \wedge c)^\dagger = c \wedge b \wedge a = -a \wedge b \wedge c. \tag{1.27}$$

Thus, reversion changes the signs of bivectors and trivectors, while, according 
to (1.24c) and (1.24d), it does not affect scalars and vectors. All this is 
summed up by applying reversion to a general multivector in the expanded 
form (1.10), with the result 

$$A^\dagger = \langle A \rangle_0 + \langle A \rangle_1 - \langle A \rangle_2 - \langle A \rangle_3. \tag{1.28}$$
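
Equation (1.28) gives a direct recipe for computing the reverse: flip the sign of the bivector and trivector parts. The sketch below uses that recipe and checks it against (1.25) to (1.27). It is an illustration under assumptions not in the text: a three-dimensional orthonormal Euclidean basis, multivectors stored as dictionaries from basis-blade bitmasks to coefficients, and hypothetical helper names.

```python
from itertools import product

def bits(k):
    return bin(k).count("1")

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bits(a & b)
        a >>= 1
    return -1 if swaps % 2 else 1

def mul(A, B, keep=lambda r, s, g: True):
    """Product of multivectors {bitmask: coeff}; keep() selects output grades."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis
        if keep(bits(ka), bits(kb), bits(k)):
            out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def gp(A, B):
    return mul(A, B)

def outer(A, B):
    return mul(A, B, lambda r, s, g: g == r + s)

def scale(s, A):
    return {k: s * c for k, c in A.items() if s * c != 0}

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

REVERSE_SIGN = (1, 1, -1, -1)   # signs of grades 0..3, read off from (1.28)

def reverse(A):
    """Reverse of a multivector in three dimensions, following (1.28)."""
    return {k: REVERSE_SIGN[bits(k)] * c for k, c in A.items()}

a, b, c = vec(1, 2, 0), vec(0, 1, 3), vec(2, 0, 1)
abc = gp(gp(a, b), c)
cba = gp(gp(c, b), a)
B2 = outer(a, b)
T3 = outer(B2, c)

print(reverse(abc) == cba)             # (1.25) for three vectors
print(reverse(B2) == outer(b, a))      # (1.26)
print(reverse(T3) == scale(-1, T3))    # (1.27)
```

The check of (1.25) is the informative one: the sign table was taken from (1.28) alone, yet it reproduces the reversed order of the vector factors.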


Magnitude 


To every multivector A there corresponds a unique scalar |A|, called the 
magnitude or modulus of A, defined by the equation 

$$|A| = \langle A^\dagger A \rangle_0^{1/2}. \tag{1.29}$$

Existence of the square root is assured by the fact that 

$$|A|^2 = \langle A^\dagger A \rangle_0 \geq 0, \tag{1.30}$$

where |A| = 0 if and only if A = 0. To prove this, first observe that 

$$|a_1 a_2 \cdots a_r|^2 = (a_1 a_2 \cdots a_r)^\dagger (a_1 a_2 \cdots a_r) = |a_1|^2 |a_2|^2 \cdots |a_r|^2 \geq 0. \tag{1.31}$$

If the vectors here are orthogonal, then they are factors of an r-blade 
$a_1 a_2 \cdots a_r = a_1 \wedge a_2 \wedge \cdots \wedge a_r$. It follows that the squared 
modulus of any non-zero r-blade is positive, that is, 

$$|\langle A \rangle_r|^2 \geq 0 \quad \text{for any grade } r. \tag{1.32}$$

Now, when we expand the product $A^\dagger A$ in terms of r-vector parts, the 
"cross terms" multiplying blades of different grades have no scalar parts, so 
they can be ignored when we take scalar parts. Thus from (1.10) we obtain the 
expansion 

$$|A|^2 = \langle A^\dagger A \rangle_0 = |\langle A \rangle_0|^2 + |\langle A \rangle_1|^2 + |\langle A \rangle_2|^2 + |\langle A \rangle_3|^2. \tag{1.33}$$

None of the terms in this expansion can be negative according to (1.32), so 
(1.30) is proved. 
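
The expansion (1.33) can be illustrated with a small computation. What follows is a sketch under assumptions not in the text: a three-dimensional orthonormal Euclidean basis, multivectors stored as dictionaries from basis-blade bitmasks to coefficients, and hypothetical helper names (`gp`, `reverse`, `grade`, `magnitude_sq`).

```python
from itertools import product

def bits(k):
    return bin(k).count("1")

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bits(a & b)
        a >>= 1
    return -1 if swaps % 2 else 1

def gp(A, B):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis
        out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def grade(A, r):
    return {k: c for k, c in A.items() if bits(k) == r}

def reverse(A):
    """Reverse in three dimensions: grades 2 and 3 change sign, as in (1.28)."""
    return {k: (1, 1, -1, -1)[bits(k)] * c for k, c in A.items()}

def magnitude_sq(A):
    """Squared magnitude: the scalar part of (reverse of A) times A, as in (1.30)."""
    return gp(reverse(A), A).get(0, 0)

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

# A = 1 + 2 e1 + 3 e1e2 + 4 e1e2e3, written with bitmasks 0, 1, 3, 7.
A = {0: 1, 1: 2, 3: 3, 7: 4}
total = magnitude_sq(A)
by_grade = [magnitude_sq(grade(A, r)) for r in range(4)]
print(total, by_grade)

# Equation (1.31) for two vectors: |ab|^2 = |a|^2 |b|^2.
a, b = vec(1, 2, 2), vec(0, 3, 4)
print(magnitude_sq(gp(a, b)), magnitude_sq(a) * magnitude_sq(b))
```

In an orthonormal basis the squared magnitude is simply the sum of the squared coefficients, which is why no term in (1.33) can be negative.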
2-1. Exercises 


(1.1) Establish the following "vector identities": 
      (a) $(a \wedge b) \cdot (c \wedge d) = b \cdot c\, a \cdot d - b \cdot d\, a \cdot c = b \cdot (c \wedge d) \cdot a$. 
      (b) $a \cdot (b \wedge c \wedge d) = a \cdot b\, c \wedge d - a \cdot c\, b \wedge d + a \cdot d\, b \wedge c$. 
      (c) $(u \wedge v) \cdot (a \wedge b \wedge c) = (u \wedge v) \cdot (a \wedge b)\,c - (u \wedge v) \cdot (a \wedge c)\,b + (u \wedge v) \cdot (b \wedge c)\,a$. 

(1.2) Vectors a, b, c are said to be linearly dependent if there exist scalars 
      $\alpha$, $\beta$, $\gamma$ (not all zero) such that $\alpha a + \beta b + \gamma c = 0$. 
      Prove that $a \wedge b \wedge c = 0$ if and only if a, b, c are linearly 
      dependent. Express the coefficients $\alpha$, $\beta$, $\gamma$ for linearly 
      dependent vectors in terms of the inner products of the vectors. 

(1.3) Solve the following vector equation for the vector x: 
      $\alpha x + a\, x \cdot b = c$. 

(1.4) In the following vector equation B is a 2-blade, 
      $\alpha x + x \cdot B = a$. 
      Solve for the vector x. A good plan of attack in this kind of problem 
      is to eliminate inner and/or outer products in favor of geometric 
      products, so one can "divide out" multiplicative factors. 

(1.5) Solve the following simultaneous equations for the vector x under 
      the assumption that $c \cdot a \neq 0$: $a \wedge x = B$, $c \cdot x = \alpha$. 

(1.6) Prove the related vector identities 
      $b^2\, a \cdot c = a \cdot b\, b \cdot c + (a \wedge b) \cdot (b \wedge c)$, 
      $\langle a \wedge b\; c \wedge b \rangle_2 = (a \wedge b \wedge c) \cdot b$. 

(1.7) Reduce $(a \wedge b \wedge c) \cdot (u \wedge v \wedge w)$ and $\langle abuv \rangle_0$ to inner 
      products of vectors. 

(1.8) The identity $ab = -ba + 2a \cdot b$ can be used to reorder vectors in a 
      product. Use it to establish the expansion formula 
      $\langle abcuvw \rangle_0 = a \cdot b \langle cuvw \rangle_0 - a \cdot c \langle buvw \rangle_0 + a \cdot u \langle bcvw \rangle_0 - a \cdot v \langle bcuw \rangle_0 + a \cdot w \langle bcuv \rangle_0$. 

(1.9) Prove that for vectors $a_k$, 
      $\langle a_1 a_2 \cdots a_r \rangle_s = 0$ if r + s is an odd integer. 

(1.10) Prove the identity (1.15). 

(1.11) Establish the "Jacobi identity" for vectors: 
      $a \cdot (b \wedge c) + b \cdot (c \wedge a) + c \cdot (a \wedge b) = 0$. 

(1.12) Prove that if $A_r$ is an r-blade, then $A_r^{-1} = \ldots$ 

(1.13)  Baewe Gi we Gab) = Ely zy 
and <AB (bi CB AD 

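(1.14) Prove that $\langle AB \rangle_0 = \langle BA \rangle_0$. 

(1.15) Prove that the bivector $a \wedge b + c \wedge d$ can be factored into a 
      product of vectors if and only if $a \wedge b \wedge c \wedge d = 0$. 

(1.16) Prove that 
      $\langle AB \rangle_+ = \langle A \rangle_+ \langle B \rangle_+ + \langle A \rangle_- \langle B \rangle_-$, 
      $\langle AB \rangle_- = \langle A \rangle_+ \langle B \rangle_- + \langle A \rangle_- \langle B \rangle_+$. 
      Use this to prove that the product of even (or odd) multivectors is 
      always even. 

(1.17) Define $A \cdot B$ and $A \wedge B$ for arbitrary multivectors A and B so that 
      the distributive properties of inner and outer products are preserved. 
      Show that if Equation (1.4) were to be generalized to 
      $aA = a \cdot A + a \wedge A$ for arbitrary A, then it would be necessary to 
      require that $a \cdot \lambda = 0$ and $a \wedge \lambda = a\lambda$ for scalar $\lambda$. 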


2-2. The Algebra of a Euclidean Plane 


Every vector a determines an oriented line, namely, the set of all vectors 
which are scalar multiples of a. Thus, every vector x on the line is related to a 
by the equation 

$$x = \alpha a. \tag{2.1}$$

This is said to be a parametric equation for the a-line. Each value of the scalar 
parameter $\alpha$ determines a unique point x on the line. A vector x is said to be 
positively directed (relative to a) if $x \cdot a > 0$, or negatively directed if 
$x \cdot a < 0$. This distinction between positive and negative vectors is called 
an orientation (or sense) of the line. The unit vector $\hat{a} = a|a|^{-1}$ is called 
the direction of the oriented line. The opposite orientation (or sense) for the 
line is obtained by reversing the assignments of positive and negative to 
vectors, that is, by designating −a as the direction of the line. If a distinction 
between the two possible orientations is not made, the line is said to be 
unoriented. 

Outer multiplication of (2.1) by a gives the equation 

$$x \wedge a = 0. \tag{2.2}$$

This is a nonparametric equation for the a-line. The a-line is the solution set 
{x} of this equation. Note that we use curly brackets { } to indicate a set. 

To show that, indeed, every solution of (2.2) has the form (2.1), use (1.1) 
to obtain from (2.2) the equation 

$$xa = x \cdot a.$$

Multiplying this equation on the right by $a^{-1} = a\,a^{-2}$ and writing 
$\alpha = x \cdot a\, a^{-2} = x \cdot a^{-1}$, one gets (2.1) as promised. 
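
The passage from the nonparametric equation (2.2) back to the parametric form (2.1) can be traced on a concrete line. The sketch below is only an illustration; the orthonormal Euclidean basis, the dictionary representation of multivectors, the exact rational arithmetic, and the helper names are assumptions introduced here.

```python
from fractions import Fraction
from itertools import product

def bits(k):
    return bin(k).count("1")

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bits(a & b)
        a >>= 1
    return -1 if swaps % 2 else 1

def mul(A, B, keep=lambda r, s, g: True):
    """Product of multivectors {bitmask: coeff}; keep() selects output grades."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis
        if keep(bits(ka), bits(kb), bits(k)):
            out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def gp(A, B):
    return mul(A, B)

def inner(A, B):
    return mul(A, B, lambda r, s, g: g == abs(r - s))

def outer(A, B):
    return mul(A, B, lambda r, s, g: g == r + s)

def scale(s, A):
    return {k: s * c for k, c in A.items() if s * c != 0}

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

a = vec(2, -1, 2)
a_inv = scale(Fraction(1, gp(a, a)[0]), a)   # a^{-1} = a / a^2

x = scale(Fraction(-3, 2), a)                # a point on the a-line
y = vec(1, 0, 0)                             # a point off the line

print(outer(x, a))                 # (2.2) holds for x
print(outer(y, a))                 # but not for y
alpha = inner(x, a_inv).get(0, 0)  # alpha = x . a^{-1}
print(alpha, scale(alpha, a) == x) # recovers the parametric form (2.1)
print(inner(x, a).get(0, 0) < 0)   # x is negatively directed relative to a
```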
Two Dimensional Vector Space 


There is an algebraic description of a plane which is quite analogous to that of 
a line. Given a (nonzero) bivector $\mathbf{B}$, the set of all vectors x which 
satisfy the equation 

$$x \wedge \mathbf{B} = 0 \tag{2.3}$$

is said to be a 2-dimensional vector space and may be referred to as the 
$\mathbf{B}$-plane. An orientation for the $\mathbf{B}$-plane is determined by 
designating a unit bivector i proportional to $\mathbf{B}$ as the direction of the 
plane. Obviously, if the relation 

$$\mathbf{B} = B\,i \tag{2.4}$$

(with the bold $\mathbf{B}$ denoting the bivector and the italic $B$ its scalar 
factor) is substituted in (2.3), the scalar $B$ can be divided out to get the 
equivalent equation 

$$x \wedge i = 0. \tag{2.5}$$

So every bivector which is a non-zero scalar multiple of i determines the same 
plane as i. Such a bivector is called a pseudoscalar of the plane. 

A parametric equation for the i-plane can be derived from (2.5) by 
factoring i into the product 

$$i = \sigma_1 \sigma_2 = \sigma_1 \wedge \sigma_2 = -\sigma_2 \sigma_1, \tag{2.6}$$

where $\sigma_1$ and $\sigma_2$ are orthogonal unit vectors, that is, 
$\sigma_1 \cdot \sigma_2 = 0$ and $\sigma_1^2 = \sigma_2^2 = 1$. Using (1.4), we obtain 
from (2.5), 

$$xi = x \cdot i, \tag{2.7}$$

or, by (2.6) and (1.16), 

$$x\sigma_1\sigma_2 = x \cdot (\sigma_1 \wedge \sigma_2) = x \cdot \sigma_1\, \sigma_2 - x \cdot \sigma_2\, \sigma_1.$$

Multiplying this on the right by $i^\dagger = \sigma_2 \sigma_1$, we obtain 

$$x = x_1 \sigma_1 + x_2 \sigma_2, \tag{2.8}$$

where $x_1 = x \cdot \sigma_1$ and $x_2 = x \cdot \sigma_2$. Equation (2.8) is a parametric 
equation for the i-plane. Scalars $x_1$ and $x_2$ are called rectangular components 
of the vector x with respect to the basis $\{\sigma_1, \sigma_2\}$. Equation (2.8) 
determines a distinct vector x for each distinct pair of values of the 
components. A typical vector x is represented by a directed line segment in 
Figure 2.1a. Orthogonal vectors like $\sigma_1$ and $\sigma_2$ are represented by 
perpendicular line segments in the figure, and the unit pseudoscalar i is 
represented by a plane segment. To be precise, i is the directed area of the 
plane segment, and the directed area of every plane segment in the i-plane is 
proportional to i. 


Geometric Interpretations of a Bivector 


The unit bivector i has two distinct geometric interpretations corresponding 
to two basic properties of the plane. First, as already mentioned, it is the unit 
of directed area, or simply, the direction of the plane. Second, as we shall see, 
it is the generator of rotations in the plane. The first interpretation is 
exemplified by Equation (2.6), which expresses the fact that the unit of area i 
can be obtained from the product of two orthogonal units of length $\sigma_1$ and 
$\sigma_2$. Equations exemplifying the second interpretation can be obtained by 
multiplying (2.6) on the left by $\sigma_1$ and $\sigma_2$ to get 

$$\sigma_1 i = \sigma_2, \tag{2.9a}$$
$$\sigma_2 i = -\sigma_1. \tag{2.9b}$$

Fig. 2.1. Diagram of the i-plane of vectors, commonly known as the "real plane". 

According to (2.9a), multiplication of $\sigma_1$ on the right by i transforms 
$\sigma_1$ into $\sigma_2$. Since $\sigma_2$ has the same magnitude as $\sigma_1$ but is 
orthogonal to it, this transformation is a rotation of $\sigma_1$ through a right 
angle. Similarly, Equation (2.9b) expresses the rotation of $\sigma_2$ through a 
right angle into $-\sigma_1$. Substitution of (2.9a) into (2.9b) gives 

$$(\sigma_1 i)i = \sigma_1 i^2 = -\sigma_1,$$

which expresses the fact that two consecutive rotations through right angles 
reverse the direction of a vector. This provides a geometric interpretation for 
the equation 

$$i^2 = -1, \tag{2.10}$$

when i and −1 are both regarded as operators (by multiplication) on vectors. 

Right multiplication by i of any vector x in the i-plane rotates x by a right 
angle into a vector x' = xi. From (2.8) and (2.9) we can get the components 
of x' from the components of x; thus, 

$$x' = xi = x_1 \sigma_2 - x_2 \sigma_1. \tag{2.11}$$

The relation of x to x' is represented in Figure 2.1. Notice that multiplication 
by i rotates vectors counterclockwise by a right angle. It is conventional to 
associate a positive orientation (or sense) of a plane (and its unit 
pseudoscalar) with a counterclockwise rotation, as indicated in Figure 2.1. A 
negative orientation (or sense) then corresponds to a clockwise rotation. 
Another common way to distinguish between the two orientations of a plane 
is with the terms "left turn" and "right turn". 
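
The operator role of i described by (2.9) to (2.11) is easy to confirm by direct multiplication. The sketch below is an illustration only; it assumes an orthonormal basis for the plane, stores multivectors as dictionaries from basis-blade bitmasks to coefficients (bitmask 3 standing for the unit pseudoscalar), and uses hypothetical helper names.

```python
from itertools import product

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1

def gp(A, B):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis
        out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def scale(s, A):
    return {k: s * c for k, c in A.items() if s * c != 0}

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

s1, s2 = vec(1, 0), vec(0, 1)   # the orthogonal unit vectors of (2.6)
i = gp(s1, s2)                  # unit pseudoscalar of the plane

print(gp(s1, i) == s2)                  # (2.9a)
print(gp(s2, i) == scale(-1, s1))       # (2.9b)
print(gp(i, i))                         # (2.10)

x = vec(3, 4)
x_rot = gp(x, i)                        # (2.11): x1 s2 - x2 s1
print(x_rot)
print(gp(x_rot, i) == scale(-1, x))     # two right-angle turns reverse x
```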


Plane Spinors 


Fig. 2.2. Diagram of the spinor i-plane, commonly known as an "Argand diagram 
of the complex plane". Each point in the spinor plane represents a 
rotation-dilation. Points on the unit circle represent pure rotations, while 
points on the positive scalar axis represent pure dilations. 

By multiplying two vectors in the i-plane, we get a quantity called a 
spinor of the i-plane. For example, by (2.6) and (2.8), from the product of 
vectors $\sigma_1$ and x, we get a spinor z in the form 

$$z = \sigma_1 x = x_1 + i x_2. \tag{2.12}$$

Quantities of the form $x_1 + i x_2$ are commonly called complex numbers. It will 
be convenient for us to adopt that terminology when we wish to emphasize 
some relation to traditional concepts. However, it must be remembered that 
besides the property $i^2 = -1$ ascribed to the traditional unit imaginary, our i 
is a bivector, so it has geometric and algebraic properties beyond those 
traditionally accorded to "imaginary numbers". The real and imaginary parts 
of a complex number z are commonly denoted by $\mathrm{Re}\{z\}$ and 
$\mathrm{Im}\{z\}$. Separation of a complex number into real and imaginary parts 
is equivalent to separating a spinor into scalar and pseudoscalar (bivector) 
parts, that is, 

$$x_1 = \mathrm{Re}\{z\} = \langle z \rangle_0 = \tfrac{1}{2}(z + z^\dagger), \tag{2.13a}$$
$$i x_2 = i\,\mathrm{Im}\{z\} = \langle z \rangle_2 = \tfrac{1}{2}(z - z^\dagger). \tag{2.13b}$$

Note that reversion of a spinor corresponds exactly to conventional complex 
conjugation, that is, 

$$z^\dagger = x\sigma_1 = x_1 - i x_2. \tag{2.14}$$

Also, our notation |z| agrees with the conventional notation for the modulus 
of a complex number, thus, 

$$|z|^2 = z^\dagger z = x_1^2 + x_2^2 = |x|^2. \tag{2.15}$$


The set of all spinors of the form (2.12) is a 2-dimensional space, which is 
aptly called the spinor plane (or spinor i-plane, to emphasize the special role 
of i). The elements of the spinor plane can be represented by directed line 
segments or points in a diagram such as Figure 2.2. Comparison of Figures 
2.1a and 2.1b shows that the points of the i-plane of vectors can be put into 
one-to-one correspondence with points of the spinor plane. A correspondence 
is determined by an arbitrary choice of some vector in the vector plane, 
say $\sigma_1$; then the product of $\sigma_1$ with each vector x is a unique spinor z, as 
is expressed by equation (2.12). Conversely, each spinor z determines a unique 
vector x according to the equation 

$$x = \sigma_1 z, \tag{2.16}$$

derived from (2.12). Note that $\sigma_1$ distinguishes a line in the vector plane to 
be associated with the scalar axis in the spinor plane. 

In spite of the correspondence between the vector and spinor planes, each 
plane has a different geometric significance, just as their elements have 
different algebraic properties. The distinction between the two planes 
corresponds to the two distinct interpretations of i. The interpretation of i as 
a directed area is indicated in Figure 2.1 for the vector plane. On the other 
hand, the operator interpretation of i as a rotation (of vectors) through a right 
angle is indicated in Figure 2.2 by the right angle that the i-axis makes 
with the scalar axis. This observation leads to an operator interpretation for 
all the spinors, and justifies calling i the generator of rotations. Consider, in 
particular, the interpretation of Equation (2.16) and its representation in 
Figures 2.1 and 2.2. Operating on $\sigma_1$ (by right multiplication), the spinor z 
transforms $\sigma_1$ into a vector x. As indicated in the figures, this 
transformation is a rotation of $\sigma_1$ through some angle $\theta$ combined with a 
dilation of $\sigma_1$ by an amount |z|. Our choice of $\sigma_1$ was arbitrary, so z 
evidently has the same effect on every vector in the i-plane. Thus, each spinor 
can be regarded as an algebraic representation of a rotation-dilation. This 
connection of spinors with rotations provides some justification for the 
terminology "spinor". Further justification comes from the fact that our use of 
the term here and in a more general sense later on is consistent with 
established use of the term "spinor" in advanced quantum mechanics. 
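
The correspondence between plane spinors and complex numbers can be checked against Python's built-in `complex` type. The sketch below is an illustration under assumptions not found in the text: an orthonormal basis for the plane, multivectors stored as dictionaries from basis-blade bitmasks to coefficients (bitmask 0 for the scalar, 3 for the pseudoscalar i), and hypothetical helper names such as `to_complex`.

```python
from itertools import product

def bits(k):
    return bin(k).count("1")

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bits(a & b)
        a >>= 1
    return -1 if swaps % 2 else 1

def gp(A, B):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for (ka, ca), (kb, cb) in product(A.items(), B.items()):
        k = ka ^ kb  # assumed orthonormal Euclidean basis
        out[k] = out.get(k, 0) + reorder_sign(ka, kb) * ca * cb
    return {k: c for k, c in out.items() if c != 0}

def reverse(A):
    """Reverse in the plane: only the bivector part changes sign."""
    return {k: (-c if bits(k) == 2 else c) for k, c in A.items()}

def vec(*coords):
    return {1 << n: c for n, c in enumerate(coords) if c != 0}

def to_complex(z):
    """Read a spinor x1 + i x2 as the ordinary complex number x1 + x2 j."""
    return complex(z.get(0, 0), z.get(3, 0))

s1, s2 = vec(1, 0), vec(0, 1)
i = gp(s1, s2)
x, y = vec(3, 4), vec(1, -2)

z = gp(s1, x)          # (2.12): z = s1 x = x1 + i x2
w = gp(s1, y)
print(z, gp(s1, z) == x)                                        # (2.16)
print(reverse(z) == gp(x, s1))                                  # (2.14)
print(to_complex(reverse(z)) == to_complex(z).conjugate())
print(to_complex(gp(z, w)) == to_complex(z) * to_complex(w))
print(gp(reverse(z), z))                                        # (2.15)
print(gp(s2, z) == gp(x, i))   # z turns s2 by the same angle as it turns s1
```

The last line illustrates the remark that z has the same effect on every vector of the plane: since $\sigma_2$ is $\sigma_1$ turned through a right angle, its image under z is x turned through a right angle.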


The Algebra of a Plane 


Some of our language from this point on will be simpler and clearer if we take 
a moment to introduce a few general concepts and definitions. Any expression 
of the form $\alpha_1 A_1 + \alpha_2 A_2 + \cdots + \alpha_n A_n$ with scalar coefficients 
$\alpha_k$ is called a linear combination of the quantities $A_1, A_2, \ldots, A_n$. The 
$A_k$ are linearly dependent if some linear combination of the $A_k$ with at least 
one nonzero coefficient is identically zero. Otherwise, they are linearly 
independent. If the $A_k$ (k = 1, ..., n) are linearly independent, then the set of 
all linear combinations of the $A_k$ is said to be an n-dimensional linear space, 
and the set $\{A_k\}$ is said to be a basis for that space. 

Returning to the particular case at hand, we recall that in connection with 
Equation (2.8) the vectors $\sigma_1$ and $\sigma_2$ have already been identified as 
comprising a basis for the i-plane. Now, when we multiply vectors $\sigma_1$ and 
$\sigma_2$, we generate the unit scalar $1 = \sigma_1^2 = \sigma_2^2$ and the unit 
pseudoscalar $i = \sigma_1 \sigma_2$ but no other new quantities. It follows that every 
multivector A which can be obtained from vectors of the i-plane by addition 
and multiplication can be written as a linear combination 

$$A = \alpha_0 + \alpha_1 \sigma_1 + \alpha_2 \sigma_2 + \alpha_3 i, \tag{2.17}$$

with scalar coefficients $\alpha_0$, $\alpha_1$, $\alpha_2$ and $\alpha_3$. We call the set of all 
such multivectors the (Geometric) Algebra of the i-plane or simply the 
i-algebra, and we denote it by $\mathcal{G}_2(i)$. We suppress the i and write 
$\mathcal{G}_2$ when we do not wish to refer to a particular plane. The subscript 2 
here refers both to the grade of the pseudoscalar and the dimension of the 
plane. 

It is clear from (2.17) that any multivector A in $\mathcal{G}_2(i)$ can be expressed 
as the sum of a vector $a = \alpha_1 \sigma_1 + \alpha_2 \sigma_2$ and a spinor 
$z = \alpha_0 + \alpha_3 i$, that is 

$$A = a + z. \tag{2.18}$$

The vectors are odd, while the spinors are even multivectors. Accordingly, we 
can express $\mathcal{G}_2$ as the sum of two linear spaces, 

$$\mathcal{G}_2 = \mathcal{G}_2^- + \mathcal{G}_2^+, \tag{2.19}$$

where $\mathcal{G}_2^-$ is the 2-dimensional space of vectors and $\mathcal{G}_2^+$ is the 
2-dimensional space of spinors. Being the sum of two 2-dimensional spaces, 
the algebra $\mathcal{G}_2$ is itself a 4-dimensional linear space. The four unit 
multivectors 1, $\sigma_1$, $\sigma_2$ and $i = \sigma_1 \sigma_2$ make up a basis for this 
space. 


A Distinction between Linear Spaces and Vector Spaces 


A comment on nomenclature is in order here. In most mathematical literature
the term “vector space” is synonymous with “linear space”. This is
because any quantities that can be added and multiplied by scalars are
commonly called vectors. However, geometric algebra ascribes other properties
to vectors, in particular, that they can be multiplied in a definite way. So
we restrict our use of the term “vector” to the precise sense we have given it
in geometric algebra. Accordingly, we restrict our use of the term vector space
to refer to a linear space of vectors. We continue to use the term “linear
space” in its usual more general sense. Thus both $\mathcal{G}_2^-(\mathbf{i})$ and
$\mathcal{G}_2^+(\mathbf{i})$ can be called linear spaces of two dimensions, but of the two,
only $\mathcal{G}_2^-(\mathbf{i})$ can be called a vector space, and $\mathcal{G}_2(\mathbf{i})$ is a
4-dimensional linear space but not a vector space.

The 2-dimensional vector space $\mathcal{G}_2^-(\mathbf{i})$ is, of course, the
$\mathbf{i}$-plane itself under another name. To describe the fact that every
nonzero vector $\mathbf{a}$ in this space has a positive square
$\mathbf{a}^2 = \mathbf{a}\cdot\mathbf{a} = |\mathbf{a}|^2$, this vector space is said to be
Euclidean. For, as we have seen in Chapter 1, Euclidean geometry can be
represented algebraically when the magnitude of a vector is interpreted as the
length of a line segment. Accordingly, we may write


    $\mathcal{G}_2^-(\mathbf{i}) = \mathcal{E}_2$    (2.20)


to express the fact that the $\mathbf{i}$-plane is a Euclidean plane, that is, a
2-dimensional Euclidean vector space $\mathcal{E}_2$. And we may refer to the
$\mathbf{i}$-algebra $\mathcal{G}_2$ as the algebra of a Euclidean plane.

One other comment about nomenclature is in order here. In Chapter 1 it 
was mentioned that a distinction between the concepts of dimension and 
grade must be made. That distinction should be clear by now. The dimension 
of a linear space is the number of linearly independent elements in the space. 
This definition of dimension is well established in mathematics, so we have 
adopted it. On the other hand, the concept of grade derives from a concept of 
vector multiplication producing new entities distinguished by grade. For a 
vector space and its algebra, the concepts of dimension and grade are closely 
related. We have seen that a vector space of dimension 2 is determined by a 
pseudoscalar of grade 2 and vice versa. It is not difficult to show that a vector
space of any finite dimension n is similarly related to a pseudoscalar of grade n.


2-3. The Algebra of Euclidean 3-Space 


The concept of a 3-dimensional Euclidean space $\mathcal{E}_3$ is fundamental to
physics, because it provides the mathematical structure for the concept of
physical space. Moreover, the properties of physical space are presupposed in
every aspect of mechanics, not to mention the rest of physics. For this reason
we cultivate the geometric algebra of $\mathcal{E}_3$ as the basic conceptual tool
for representing and analyzing geometrical relations in physics.

We can analyze the algebra of $\mathcal{E}_3$ in the same way that we analyzed the
algebra of $\mathcal{E}_2$. Let $i$ be a unit 3-blade. The set of all vectors
$\mathbf{x}$ which satisfy the equation


    $\mathbf{x}\wedge i = 0$    (3.1)


is the Euclidean 3-dimensional vector space $\mathcal{E}_3$. Scalar multiples of $i$
are called pseudoscalars of this vector space, and we refer to $i$ as the unit
pseudoscalar.


Note that the symbol “$i$” is an exception to our convention that k-blades be
represented in boldface type. We make this exception to emphasize the
singular importance of $i$ and to distinguish it from unit 2-blades which we
have represented by $\mathbf{i}$. The set of all multivectors generated from the
vectors of $\mathcal{E}_3$ by addition and multiplication is the (geometric) algebra
of $\mathcal{E}_3$, and it will be denoted by $\mathcal{G}_3$ or $\mathcal{G}_3(i)$. One way to
study the structure of $\mathcal{G}_3$ is by constructing a basis for the algebra.

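Because of (3.1), we can factor $i$ into a product of three orthonormal
vectors: thus,


    $i = \boldsymbol{\sigma}_1\boldsymbol{\sigma}_2\boldsymbol{\sigma}_3 = \boldsymbol{\sigma}_1\wedge\boldsymbol{\sigma}_2\wedge\boldsymbol{\sigma}_3$.    (3.2)


The term “orthonormal” means orthogonal and normalized to unity. The
normalization of vectors $\boldsymbol{\sigma}_1$, $\boldsymbol{\sigma}_2$, $\boldsymbol{\sigma}_3$ is
expressed by


    $\boldsymbol{\sigma}_1^2 = \boldsymbol{\sigma}_2^2 = \boldsymbol{\sigma}_3^2 = 1$.    (3.3a)


The orthogonality of the vectors is expressed by the equations

    $\boldsymbol{\sigma}_i\cdot\boldsymbol{\sigma}_j = 0$  if $i \neq j$,    (3.3b)

or equivalently, by

    $\boldsymbol{\sigma}_i\boldsymbol{\sigma}_j = \boldsymbol{\sigma}_i\wedge\boldsymbol{\sigma}_j = -\boldsymbol{\sigma}_j\boldsymbol{\sigma}_i$  if $i \neq j$,    (3.4)


with i, j = 1, 2, 3 understood.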

We further assume that $\boldsymbol{\sigma}_1$, $\boldsymbol{\sigma}_2$, $\boldsymbol{\sigma}_3$ make up a
righthanded or dextral set of vectors. The term “righthanded” actually
concerns the interpretation of $\mathcal{G}_3$ rather than some intrinsic property of
the algebra. This interpretation arises from the correspondence of vectors
with directions in physical space as indicated in Figure 3.1. As the figure
shows, a righthanded screw pointing in the $\boldsymbol{\sigma}_3$ direction will
advance in that direction when given a counterclockwise rotation in the
$(\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2)$-plane. Equation (3.2) specifies a definite relation
of the pseudoscalar $i$ to the righthanded set of vectors, which we express in
words by saying that $i$ is the dextral or righthanded unit pseudoscalar. By
reversing the directions of the $\boldsymbol{\sigma}_k$, we get a lefthanded set of
vectors $\{-\boldsymbol{\sigma}_k\}$ and the lefthanded unit pseudoscalar
$(-\boldsymbol{\sigma}_1)(-\boldsymbol{\sigma}_2)(-\boldsymbol{\sigma}_3) = -i$. The terms “righthanded”
and “lefthanded” distinguish the two possible orientations of a 3-dimensional
vector space, just as the terms “counterclockwise” and “clockwise” distinguish
the two possible orientations of a plane.

Fig. 3.1. Orthonormal basis for $\mathcal{G}_3$. Directed line segments represent
unit vectors. Directed plane segments represent unit bivectors. The oriented
cube represents the unit pseudoscalar $i$.

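Every vector $\mathbf{x}$ in $\mathcal{E}_3$ is related to the $\boldsymbol{\sigma}_k$ by an
equation of the form


    $\mathbf{x} = x_1\boldsymbol{\sigma}_1 + x_2\boldsymbol{\sigma}_2 + x_3\boldsymbol{\sigma}_3$,    (3.5)


as represented in Figure 3.2. The scalars $x_k = \mathbf{x}\cdot\boldsymbol{\sigma}_k$ are called
rectangular components of the vector $\mathbf{x}$ with respect to the basis
$\{\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2, \boldsymbol{\sigma}_3\}$. Equation (3.5) can be derived as
follows; from (3.1), we deduce


    $\mathbf{x}i = \mathbf{x}\cdot i$,

but by (3.2) and (1.16),

    $\mathbf{x}i = \mathbf{x}\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2\boldsymbol{\sigma}_3 = \mathbf{x}\cdot(\boldsymbol{\sigma}_1\wedge\boldsymbol{\sigma}_2\wedge\boldsymbol{\sigma}_3) = (\mathbf{x}\cdot\boldsymbol{\sigma}_1)\,\boldsymbol{\sigma}_2\wedge\boldsymbol{\sigma}_3 - (\mathbf{x}\cdot\boldsymbol{\sigma}_2)\,\boldsymbol{\sigma}_1\wedge\boldsymbol{\sigma}_3 + (\mathbf{x}\cdot\boldsymbol{\sigma}_3)\,\boldsymbol{\sigma}_1\wedge\boldsymbol{\sigma}_2$.


Multiplying this on the right by $i^\dagger = \boldsymbol{\sigma}_3\boldsymbol{\sigma}_2\boldsymbol{\sigma}_1$ and using
(3.3a) and (3.4) we get (3.5) as expected.

Fig. 3.2. Rectangular components of a vector.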


Bivectors of $\mathcal{G}_3$


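Inspection of (3.4) shows that by multiplication of the $\boldsymbol{\sigma}_k$ we
obtain exactly three linearly independent bivectors, namely


    $\mathbf{i}_1 = \boldsymbol{\sigma}_2\boldsymbol{\sigma}_3 = i\boldsymbol{\sigma}_1$,
    $\mathbf{i}_2 = \boldsymbol{\sigma}_3\boldsymbol{\sigma}_1 = i\boldsymbol{\sigma}_2$,    (3.6)
    $\mathbf{i}_3 = \boldsymbol{\sigma}_1\boldsymbol{\sigma}_2 = i\boldsymbol{\sigma}_3$.


The last equality in each line of (3.6) was obtained by multiplying (3.2)
successively by $\boldsymbol{\sigma}_1$, $\boldsymbol{\sigma}_2$ and $\boldsymbol{\sigma}_3$. Note that the
three equations (3.6) differ only by a cyclic permutation of the indices
1, 2, 3.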

Since the $\mathbf{i}_k$ are the only bivectors which can be obtained from the
$\boldsymbol{\sigma}_k$ by multiplication, any bivector $\mathbf{B}$ in $\mathcal{G}_3$ can be
expressed as the linear combination


    $\mathbf{B} = B_1\mathbf{i}_1 + B_2\mathbf{i}_2 + B_3\mathbf{i}_3$    (3.7)

with scalar coefficients $B_k$. Thus, the set of all bivectors in $\mathcal{G}_3$ is a
3-dimensional linear space with a basis $\{\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3\}$. Now, by
substituting (3.6) into (3.7), we find that every bivector $\mathbf{B}$ is uniquely
related to a vector $\mathbf{b} = B_1\boldsymbol{\sigma}_1 + B_2\boldsymbol{\sigma}_2 + B_3\boldsymbol{\sigma}_3$ by
the equation


    $\mathbf{B} = i\mathbf{b}$.    (3.8)


This relation is expressed in words by saying that the bivector $\mathbf{B}$ is the
dual of the vector $\mathbf{b}$. In general, we define the dual of any multivector
$A$ in $\mathcal{G}_3$ to be its product $iA$ with the dextral unit pseudoscalar.

From (3.2) and (3.5) it should be clear that every trivector in $\mathcal{G}_3$ is
proportional to $i$. It follows, then, from (3.1) that $\mathcal{G}_3$ contains no
nonzero k-vectors with $k \geq 4$. Therefore, by axiom (1.11), every
multivector $A$ in $\mathcal{G}_3$ can be expressed in the expanded form


    $A = \langle A\rangle_0 + \langle A\rangle_1 + \langle A\rangle_2 + \langle A\rangle_3$.    (3.9)


Introducing the notations $\alpha = \langle A\rangle_0$ and $\mathbf{a} = \langle A\rangle_1$ for the
scalar and vector parts of $A$, and expressing the bivector and pseudoscalar
parts of $A$ as duals of a vector and a scalar by writing
$\langle A\rangle_2 = i\mathbf{b}$ and $\langle A\rangle_3 = i\beta$, we can put (3.9) in the form


    $A = \alpha + \mathbf{a} + i\mathbf{b} + i\beta$.    (3.10)


This multivector has one scalar component, 3 vector components, 3 bivector
components and one pseudoscalar component. Thus, $\mathcal{G}_3$ is a linear space
with 1 + 3 + 3 + 1 = 8 dimensions. As a basis for that space we may use the 8
unit multivectors $\{1, \boldsymbol{\sigma}_k, i\boldsymbol{\sigma}_k, i\}$, with k = 1, 2, 3
understood. The subspace of k-vectors in $\mathcal{G}_3$ can be denoted by
$\langle\mathcal{G}_3\rangle_k$; thus, $\langle\mathcal{G}_3\rangle_1$ is a 3-dimensional space of vectors,
$\langle\mathcal{G}_3\rangle_2$ is a 3-dimensional space of bivectors and $\langle\mathcal{G}_3\rangle_3$ is a
1-dimensional space of trivectors.


The Pseudoscalar of $\mathcal{E}_3$


Although we established some important properties of $\mathcal{G}_3$ in the course
of determining a basis for the algebra, reference to a basis was not at all
necessary, and we shall avoid it in the future except when it is an essential
part of the problem at hand. In computations with $\mathcal{G}_3$ the pseudoscalar
$i$ plays a crucial role, so we list now its basic properties:


    $i^\dagger = -i$,    (3.11a)
    $i^2 = -1$,    (3.11b)
    $iA = Ai$  for every $A$ in $\mathcal{G}_3$,    (3.11c)
    $\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c} = \lambda i$    (3.11d)


for any vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ in $\mathcal{E}_3$; the scalar $\lambda$ is
positive if and only if the vectors make up a righthanded set in the order
given. Properties (3.11a, b) follow from the fact that $i$ is a 3-blade
normalized to unity. Property (3.11d) obviously generalizes (3.2). Properties
(3.11a, c) are both consequences of Equation (3.1), but, conversely, Equation
(3.1) can easily be derived from them.
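
The duality relations (3.6) and the properties (3.11a, b, c) can be verified
numerically on the 8-element basis. The sketch below is a minimal illustration
under assumptions of its own: multivectors are stored as dictionaries mapping
a 3-bit mask (bit k−1 set when σ_k is a factor of the basis blade) to a
coefficient, and the names `gp`, `reorder_sign` and `reverse` are hypothetical
helpers, not notation from the text.

```python
from itertools import product

def reorder_sign(a, b):
    """Sign picked up when basis blades a and b (bitmasks) are multiplied
    and their vector factors are sorted into the order s1, s2, s3."""
    swaps = 0
    a >>= 1
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1

def gp(x, y):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        blade = a ^ b                      # s_k * s_k = 1 cancels repeated factors
        out[blade] = out.get(blade, 0) + reorder_sign(a, b) * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def reverse(x):
    """Reversion: a blade of grade g is multiplied by (-1)**(g*(g-1)//2)."""
    out = {}
    for blade, c in x.items():
        g = bin(blade).count("1")
        out[blade] = c * (-1) ** (g * (g - 1) // 2)
    return out

S1, S2, S3 = {0b001: 1}, {0b010: 1}, {0b100: 1}
I = gp(gp(S1, S2), S3)                     # unit pseudoscalar, Equation (3.2)
BASIS = [{m: 1} for m in range(8)]         # the 8 unit multivectors of G_3
```

For example, `gp(S2, S3) == gp(I, S1)` reproduces the first line of (3.6), and
`gp(I, I)` returns `{0: -1}`, the scalar −1.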


Complex Numbers 


The symbol $i$ has been chosen for the unit pseudoscalar, because the
properties (3.11a, b, c) are similar to those usually attributed to “the
square root of minus one” in mathematics. We have seen, however, that there is
not just one root of minus one in geometric algebra, but many. By inspection
of the elements in a basis of the algebra, we see that in $\mathcal{G}_3$ there are
two distinct kinds of solutions to the equation


    $x^2 = -1$;


either $x$ is a pseudoscalar, whence $x = \pm i$, or $x$ is a bivector such as
$\mathbf{i}_1 = \boldsymbol{\sigma}_2\boldsymbol{\sigma}_3$. To get a unique bivector solution of the
equation we must have some information which determines the plane of the
bivector, such as the directions of two noncollinear vectors in the plane.

Complex numbers are widely used in mathematical physics. To translate a
specific application of complex numbers into geometric algebra, it is
necessary to identify the geometrical role that the $\sqrt{-1}$ tacitly plays in
the application. Usually, it will be found that the $\sqrt{-1}$ can be associated
with some plane in physical space, so it should be interpreted as a bivector.
Whenever this is done, the physical significance of the mathematical apparatus
becomes more transparent, and the power of the theory is enhanced. This will
be demonstrated by many examples in the rest of the book. In some applications
of complex numbers to physics it is not so easy to attribute some
physicogeometrical significance to the $\sqrt{-1}$. Our experience with geometric
algebra then suggests that there must be a better way to formulate a problem.


Quaternions are Spinors 


We have seen that the algebra of complex numbers appears with a geometric
interpretation as the subalgebra $\mathcal{G}_2^+$ of even multivectors in $\mathcal{G}_2$.
Similarly, we can express $\mathcal{G}_3$ as the sum of an odd part $\mathcal{G}_3^-$ and
even part $\mathcal{G}_3^+$, that is,


    $\mathcal{G}_3 = \mathcal{G}_3^- + \mathcal{G}_3^+$.


According to (3.10) then, we can write a multivector $A$ in the form


    $A = \langle A\rangle_- + \langle A\rangle_+$,    (3.12a)
where
    $\langle A\rangle_- = \mathbf{a} + i\beta$,    (3.12b)
    $\langle A\rangle_+ = \alpha + i\mathbf{b}$.    (3.12c)


As is easily verified, $\mathcal{G}_3^+$ is closed under multiplication, so it is a
subalgebra of $\mathcal{G}_3$, though $\mathcal{G}_3^-$ is not. For this reason $\mathcal{G}_3^+$ is
sometimes called the even subalgebra of $\mathcal{G}_3$. But it may be better to refer
to $\mathcal{G}_3^+$ as the spinor algebra or subalgebra, to emphasize the geometric
significance of its elements. Just as every spinor in $\mathcal{G}_2^+$ represents a
rotation-dilation in 2 dimensions, so every spinor in $\mathcal{G}_3^+$ represents a
rotation-dilation in 3 dimensions. The representation of rotations by spinors
is discussed fully in Chapter 5.

Equation (3.12c) shows that each spinor can be expressed as the sum of a
scalar and a bivector. In view of (3.7), then, the four quantities $1$,
$\mathbf{i}_1$, $\mathbf{i}_2$, $\mathbf{i}_3$ make up a basis for $\mathcal{G}_3^+$. Thus $\mathcal{G}_3^+$ is
a linear space of 4 dimensions. For this reason, the elements of $\mathcal{G}_3^+$
were called quaternions by William Rowan Hamilton, who invented them in 1843
independently of the full geometric algebra from which they arise here.
Following Hamilton, we may also use the name Quaternion Algebra for
$\mathcal{G}_3^+$.

Quaternions are well known to mathematicians today as the largest pos- 
sible associative division algebra. But few are aware of how quaternions fit in 
the more general system of geometric algebra. Some might say that the 
quaternion algebra is actually distinct from the spinor algebra $\mathcal{G}_3^+$, that these
algebras are not identical but only isomorphic. But such a distinction only 
serves to complicate mathematics unnecessarily. The identification of qua- 
ternions with spinors is fully justified not only because they have equivalent 
algebraic properties, but more important, because they have the same geo- 
metric significance. Hamilton’s choice of the name quaternion is unfortunate, 
for the name merely refers to the comparatively insignificant fact that the 
quaternions compose a linear space of four dimensions. The name quaternion 
diverts attention from the key fact that Hamilton had invented a geometric 
algebra. Hamilton’s work itself shows clearly the crucial role of geometry in 
his invention. Hamilton was consciously looking for a system of numbers to 
represent rotations in three dimensions. He was looking for a way to describe 
geometry by algebra, so he found a geometric algebra. 

Hamilton developed his quaternion algebra at about the same time that 
Herman Grassmann developed his ‘‘algebra of extension” based on the inner 
and outer products. In spite of the fact that both Hamilton and Grassmann 
eventually came to know and admire one another’s work, for several decades 
neither of them could see how their respective geometric algebras were 
related. It was only late in his life that Grassmann realized that Hamilton’s 
quaternions can be derived simply by adding his inner and outer products to 
get the geometric product $\mathbf{a}\mathbf{b} = \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\wedge\mathbf{b}$, but it was too late for him to
pursue the implications of this insight very far. At about the same time, the
English mathematician W. K. Clifford independently realized that Hamilton 
and Grassmann were approaching one and the same subject from different 
points of view. By combining their algebraic ideas, he was led, in 1876, to the 
geometric product. Unfortunately, death claimed him before he was able to 
fully delineate the rich mixture of geometric and algebraic ideas he dis- 
covered, and no successor appeared to continue his work with the same depth 
of geometric insight. Consequently, the mathematical world continued to 
regard Grassmann’s and Hamilton’s algebras as independent systems. Di- 
vided, they fell into relative disuse. 

Quaternions today reside in a kind of mathematical limbo, because their 
place in a more general geometric algebra is not recognized. The prevailing 
attitude toward quaternions is exhibited in a biographical sketch of Hamilton 
by the late mathematician E. T. Bell. The sketch is titled “An Irish Tragedy”,
because for the last twenty years of his life, Hamilton concentrated all his
enormous mathematical powers on the study of quaternions in, as Bell would 
have it, the quixotic belief that quaternions would play a central role in the 
mathematics of the future. Hamilton’s judgement was based on a new and 
profound insight into the relation between algebra and geometry. Bell’s 
evaluation was made by surveying the mathematical literature nearly a 
century later. But union with Grassmann’s algebra puts quaternions in a 
different perspective. It may yet prove true that Hamilton looking ahead saw 
further than Bell looking back. 

Clifford may have been the first person to find significance in the fact that 
two different interpretations of number can be distinguished, the quantitative
and the operational. On the first interpretation, number is a measure of “how
much” or “how many” of something. On the second, number describes a
relation between different quantities. The distinction is nicely illustrated by 
recalling the interpretations already given to a unit bivector i. Interpreted 
quantitatively, i is a measure of directed area. Operationally interpreted, i 
specifies a rotation in the i-plane. Clifford observed that Grassmann devel- 
oped the idea of directed number from the quantitative point of view, while 
Hamilton emphasized the operational interpretation. The two approaches are 
brought together by the geometric product. Either a quantitative or an 
operational interpretation can be given to any number, yet one or the other 
may be more important in most applications. Thus, vectors are usually 
interpreted quantitatively, while spinors are usually interpreted operation- 
ally. Of course the algebraic properties of vectors and spinors can be studied 
abstractly with no reference whatsoever to interpretation. But interpretation 
is crucial when algebra functions as a language. 


The Vector Cross Product 


Vector algebra, as conceived by J. Willard Gibbs in 1884, is widely used as the
basic mathematical language in physics textbooks today, so it is important to
show that this system fits naturally into $\mathcal{G}_3$. The demonstration is easy.
We need only introduce the vector cross product $\mathbf{a}\times\mathbf{b}$ defined by the
equation


    $\mathbf{a}\times\mathbf{b} = -i\,\mathbf{a}\wedge\mathbf{b}$,
or, equivalently,
    $\mathbf{a}\wedge\mathbf{b} = i\,\mathbf{a}\times\mathbf{b}$.    (3.13)


Thus, $\mathbf{a}\times\mathbf{b}$ is the vector dual to the bivector $\mathbf{a}\wedge\mathbf{b}$. As shown
in Figure 3.3, the sign of the duality is chosen so that the vectors
$\mathbf{a}$, $\mathbf{b}$, $\mathbf{a}\times\mathbf{b}$, in that order, form a righthanded set. This
agrees with our convention for the handedness of the pseudoscalar $i$, for by
comparing (3.13) with (3.6), we see that


    $\boldsymbol{\sigma}_1\times\boldsymbol{\sigma}_2 = \boldsymbol{\sigma}_3$.    (3.14)


To remember the correct sign in the duality relation (3.13), it is helpful to
note that the geometric product of vectors in $\mathcal{E}_3$ can be written

    $\mathbf{a}\mathbf{b} = \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\wedge\mathbf{b} = \mathbf{a}\cdot\mathbf{b} + i\,\mathbf{a}\times\mathbf{b}$,    (3.15)


and (3.13) can be obtained from this by separately equating bivector parts.
Finally, note that by squaring (3.13) we deduce


    $(\mathbf{a}\times\mathbf{b})^2 = -(\mathbf{a}\wedge\mathbf{b})^2 = |\mathbf{a}\wedge\mathbf{b}|^2$.    (3.16)


Hence, the magnitude $|\mathbf{a}\times\mathbf{b}|$ is equal to the area of the
parallelogram in Figure 3.3, which was identified as $|\mathbf{a}\wedge\mathbf{b}|$ in
Section 1-5.
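
As a quick numerical check of (3.13) and (3.16), consider the following worked
example. The two vectors are chosen here purely for illustration (they do not
appear in the text), and the last line uses the standard identity
$|\mathbf{a}\wedge\mathbf{b}|^2 = \mathbf{a}^2\mathbf{b}^2 - (\mathbf{a}\cdot\mathbf{b})^2$ as an independent check.

```latex
\begin{align*}
\mathbf{a} &= 2\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2, \qquad
\mathbf{b} = \boldsymbol{\sigma}_2 + 3\boldsymbol{\sigma}_3, \\
\mathbf{a}\wedge\mathbf{b}
  &= 2\,\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2 + 6\,\boldsymbol{\sigma}_1\boldsymbol{\sigma}_3
     + 3\,\boldsymbol{\sigma}_2\boldsymbol{\sigma}_3
   = 3\,\mathbf{i}_1 - 6\,\mathbf{i}_2 + 2\,\mathbf{i}_3
   = i\,(3\boldsymbol{\sigma}_1 - 6\boldsymbol{\sigma}_2 + 2\boldsymbol{\sigma}_3), \\
\mathbf{a}\times\mathbf{b}
  &= -i\,\mathbf{a}\wedge\mathbf{b}
   = 3\boldsymbol{\sigma}_1 - 6\boldsymbol{\sigma}_2 + 2\boldsymbol{\sigma}_3, \\
(\mathbf{a}\times\mathbf{b})^2 &= 9 + 36 + 4 = 49
   = \mathbf{a}^2\mathbf{b}^2 - (\mathbf{a}\cdot\mathbf{b})^2 = 5\cdot 10 - 1^2 .
\end{align*}
```

The result agrees with the familiar component formula for the cross product,
and $|\mathbf{a}\times\mathbf{b}| = 7$ is the area of the $(\mathbf{a}, \mathbf{b})$-parallelogram.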

At this point, a caveat is in order. Books on vector algebra commonly make a
distinction between polar vectors and axial vectors, with $\mathbf{a}\times\mathbf{b}$
identified as an axial vector if $\mathbf{a}$ and $\mathbf{b}$ are polar vectors. This
confusing practice of admitting two kinds of vectors is wholly unnecessary.
An “axial vector” is nothing more than a bivector disguised as a vector. So
with bivectors at our disposal, we can do without axial vectors. As we have
defined it in (3.13), the quantity $\mathbf{a}\times\mathbf{b}$ is a vector in exactly the
same sense that $\mathbf{a}$ and $\mathbf{b}$ are vectors.

Fig. 3.3. Duality of the cross product and the outer product.



The ease with which conventional vector algebra fits into $\mathcal{G}_3$ is no accident.
Gibbs constructed his system from the same ideas of Grassmann and Hamilton
that have gone into geometric algebra. By the end of the 19th century a lively
controversy had developed as to which system was more suitable for the work
of theoretical physics, the quaternions or vector algebra. A glance at modern
textbooks shows that the votaries of vectors were victorious. However,
quaternions have reappeared disguised as matrices and proved to be essential
in modern quantum mechanics. The ironic thing about the vector-quaternion
controversy is that there was nothing substantial to dispute. Far from being in
opposition, the two systems complement each other and, as we have seen, are
perfectly united in the geometric algebra $\mathcal{G}_3$. The whole controversy was
founded on the failure of everyone involved to appreciate the distinction
between vectors and bivectors. Indeed, the word “vector” was originally
coined by Hamilton for what we now call a bivector. Gibbs changed the
meaning of the word to its present sense, but no one at the time understood
the real significance of the change he had made.


2-3. Exercises 


In the following exercises and throughout the book, the symbols
$\boldsymbol{\sigma}_k$ and $i$ always have the meanings assigned to them in this
chapter. Also, unless otherwise indicated, we will assume that every
multivector is an element of $\mathcal{G}_3$. The perceptive reader will be able to
identify those instances where such an assumption is unnecessary.

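(3.1) (a) Prove that a vector $\mathbf{x}$ is in the $\mathbf{B}$-plane if and only if
      $\mathbf{x}\mathbf{B} = -\mathbf{B}\mathbf{x}$.
      (b) Prove that $\mathbf{x}' = \mathbf{x}\mathbf{B}$ is a vector with the properties
      $|\mathbf{x}'| = |\mathbf{B}||\mathbf{x}|$ and $\mathbf{x}\cdot\mathbf{x}' = 0$.

(3.2) If $\mathbf{a}$ and $\mathbf{b}$ are vectors in the plane of
      $\mathbf{i} = \boldsymbol{\sigma}_1\boldsymbol{\sigma}_2$, show that the ratio of $\mathbf{a}\wedge\mathbf{b}$ to
      $\mathbf{i}$ is equal to the determinant


      $\begin{vmatrix} \mathbf{a}\cdot\boldsymbol{\sigma}_1 & \mathbf{b}\cdot\boldsymbol{\sigma}_1 \\ \mathbf{a}\cdot\boldsymbol{\sigma}_2 & \mathbf{b}\cdot\boldsymbol{\sigma}_2 \end{vmatrix} = \mathbf{a}\cdot\boldsymbol{\sigma}_1\,\mathbf{b}\cdot\boldsymbol{\sigma}_2 - \mathbf{b}\cdot\boldsymbol{\sigma}_1\,\mathbf{a}\cdot\boldsymbol{\sigma}_2$.
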
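(3.3) Prove the following important identities:

      $\mathbf{a}\cdot\mathbf{b} = -i[\mathbf{a}\wedge(i\mathbf{b})]$.
      $\mathbf{b}\times\mathbf{a} = i(\mathbf{a}\wedge\mathbf{b}) = \mathbf{a}\cdot(i\mathbf{b}) = -(i\mathbf{b})\cdot\mathbf{a}$.
      $\mathbf{a}\cdot(\mathbf{b}\wedge\mathbf{c}) = -\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{a}\cdot\mathbf{b}\,\mathbf{c} - \mathbf{a}\cdot\mathbf{c}\,\mathbf{b}$.
      $\mathbf{a}\wedge(\mathbf{b}\wedge\mathbf{c}) = i\,\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \tfrac{1}{2}(\mathbf{a}\mathbf{b}\mathbf{c} - \mathbf{c}\mathbf{b}\mathbf{a})$.

Note that the first identity expresses the inner product in terms of the outer
product and two duality operations. With the help of the remaining identities,
any result of conventional vector algebra can easily be derived from the more
powerful results for inner and outer products established in Section 2.1.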

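(3.4) Reexpress the identities of Exercise (1.1) in terms of the dot and
      cross products alone.

(3.5) Use an identity in Exercise (1.1) to prove that


      $\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \boldsymbol{\sigma}_1 & a_1 & b_1 \\ \boldsymbol{\sigma}_2 & a_2 & b_2 \\ \boldsymbol{\sigma}_3 & a_3 & b_3 \end{vmatrix} = \boldsymbol{\sigma}_1\begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} - \boldsymbol{\sigma}_2\begin{vmatrix} a_1 & b_1 \\ a_3 & b_3 \end{vmatrix} + \boldsymbol{\sigma}_3\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}$,


      where $a_k = \mathbf{a}\cdot\boldsymbol{\sigma}_k$, $b_k = \mathbf{b}\cdot\boldsymbol{\sigma}_k$.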
(3.6) From Equation (3.10), show that


      $A^\dagger = \alpha - i\beta + \mathbf{a} - i\mathbf{b}$,
      $|A|^2 = \alpha^2 + \beta^2 + \mathbf{a}^2 + \mathbf{b}^2$.


(3.7) The quaternions can be defined as the set of quantities $Q$ of the
      form


      $Q = Q_0 + Q_1\mathbf{i}_1 + Q_2\mathbf{i}_2 + Q_3\mathbf{i}_3$,


      where the $Q_k$ (k = 0, 1, 2, 3) are scalar coefficients and the
      $\mathbf{i}_k$ satisfy the equations


      $\mathbf{i}_1^2 = \mathbf{i}_2^2 = \mathbf{i}_3^2 = -1$,

      $\mathbf{i}_1\mathbf{i}_2\mathbf{i}_3 = 1$.

      Show that the bivector basis given by Equations (3.6) has these
      properties.

      Hamilton used the symbols $i$, $j$, $k$ instead of $\mathbf{i}_1$, $\mathbf{i}_2$,
      $\mathbf{i}_3$ and wrote down the famous equations

      $i^2 = j^2 = k^2 = -1$,

      $ijk = -1$.

      Of what geometrical significance is the difference in sign between
      this last equation and the corresponding equation above?



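(3.8) The expansion of a vector $\mathbf{b}$ in terms of its components
      $b_k = \mathbf{b}\cdot\boldsymbol{\sigma}_k$ is commonly expressed by any of the notations


      $\mathbf{b} = \sum_{k=1}^{3} b_k\boldsymbol{\sigma}_k = \sum_k b_k\boldsymbol{\sigma}_k = b_k\boldsymbol{\sigma}_k = b_1\boldsymbol{\sigma}_1 + b_2\boldsymbol{\sigma}_2 + b_3\boldsymbol{\sigma}_3$.


      The most abbreviated form $b_k\boldsymbol{\sigma}_k$ employs the so called summation
      convention, which calls for summation over all allowed values of a
      repeated pair of indices in a single term. By this convention, the
      expansion of a bivector $\mathbf{B}$ in terms of components


      $B_{ij} = \boldsymbol{\sigma}_i\cdot\mathbf{B}\cdot\boldsymbol{\sigma}_j = (\boldsymbol{\sigma}_j\wedge\boldsymbol{\sigma}_i)\cdot\mathbf{B} = -B_{ji}$


      can be written


      $\mathbf{B} = \tfrac{1}{2}B_{ij}\,\boldsymbol{\sigma}_i\wedge\boldsymbol{\sigma}_j = B_{12}\,\boldsymbol{\sigma}_1\wedge\boldsymbol{\sigma}_2 + B_{23}\,\boldsymbol{\sigma}_2\wedge\boldsymbol{\sigma}_3 + B_{31}\,\boldsymbol{\sigma}_3\wedge\boldsymbol{\sigma}_1$.

      Show that the duality relation

      $\mathbf{B} = i\mathbf{b}$

      can be expressed in terms of components by the equations

      $B_{ij} = \epsilon_{ijk}b_k$,

      where $\epsilon_{ijk}$ is defined by

      $\epsilon_{ijk} = \dfrac{\boldsymbol{\sigma}_i\wedge\boldsymbol{\sigma}_j\wedge\boldsymbol{\sigma}_k}{i} = i^\dagger(\boldsymbol{\sigma}_i\wedge\boldsymbol{\sigma}_j\wedge\boldsymbol{\sigma}_k)$.

      Note that $\epsilon_{ijk} = 0$ if any pair of indices have the same value,
      while

      $\epsilon_{ijk} = \epsilon_{123} = 1$ if {i, j, k} is an even permutation of {1, 2, 3},
      $\epsilon_{ijk} = \epsilon_{213} = -1$ if {i, j, k} is an odd permutation of {1, 2, 3}.

      Also prove that

      $\mathbf{a}\times\mathbf{b} = a_i b_j \epsilon_{ijk}\boldsymbol{\sigma}_k$,

      $\dfrac{\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c}}{i} = \epsilon_{ijk}a_i b_j c_k$.
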
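(3.9) Prove

      $\boldsymbol{\sigma}_k\mathbf{a}\boldsymbol{\sigma}_k = -\mathbf{a}$,
      $\boldsymbol{\sigma}_k\,\mathbf{a}\wedge\mathbf{b}\,\boldsymbol{\sigma}_k = \mathbf{b}\wedge\mathbf{a}$,
      $\boldsymbol{\sigma}_k\,\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c}\,\boldsymbol{\sigma}_k = 3\,\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c}$.    (sum over k)

(3.10) Solve the following vector equation for $\mathbf{x}$:


      $\alpha\mathbf{x} + \mathbf{b}\times\mathbf{x} = \mathbf{a}$.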

2-4. Directions, Projections and Angles 


In Chapter 1 we saw how first the inner and outer products and finally the
geometric product were invented to give algebraic expression to the geometric
concept of direction. We are now prepared to express the primitive
relations among directions in the simplest possible algebraic terms. Having
done so, we will be able to analyze complex relations among directions by
straightforward computation.

The geometric product provides us with an algebraic measure of relative
direction. Indeed, the geometric product $\mathbf{a}\mathbf{b}$ was specifically designed
to contain all the information about the relative directions of vectors
$\mathbf{a}$ and $\mathbf{b}$. Part of this information can be extracted by decomposing
$\mathbf{a}\mathbf{b}$ into symmetric (scalar) and antisymmetric (bivector) parts according
to the fundamental formula


    $\mathbf{a}\mathbf{b} = \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\wedge\mathbf{b}$.    (4.1)


The fact that the product $\mathbf{a}\mathbf{b}$ is indeed a direct measure of the relative
directions of vectors $\mathbf{a}$ and $\mathbf{b}$ follows from the interpretations
associated with $\mathbf{a}\cdot\mathbf{b}$ and $\mathbf{a}\wedge\mathbf{b}$ in Chapter 1. Accordingly,
vectors $\mathbf{a}$ and $\mathbf{b}$ are collinear if and only if
$\mathbf{a}\mathbf{b} = \mathbf{b}\mathbf{a}$, and they are orthogonal if and only if
$\mathbf{a}\mathbf{b} = -\mathbf{b}\mathbf{a}$. In general, $\mathbf{a}\mathbf{b}$ has an intermediate “degree of
commutativity” and, hence, describes a relative direction somewhere between
these two extremes.

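Given a vector $\mathbf{b}$, any vector $\mathbf{a}$ can be resolved into a vector
$\mathbf{a}_\parallel$ collinear with $\mathbf{b}$ and a vector $\mathbf{a}_\perp$ orthogonal to
$\mathbf{b}$. Explicit algebraic expressions for this resolution are obtained from
Equation (4.1) by dividing by the vector $\mathbf{b}$; thus,


    $\mathbf{a} = \mathbf{a}_\parallel + \mathbf{a}_\perp$,    (4.2a)
where
    $\mathbf{a}_\parallel = \mathbf{a}\cdot\mathbf{b}\,\mathbf{b}^{-1}$,    (4.2b)
    $\mathbf{a}_\perp = \mathbf{a}\wedge\mathbf{b}\,\mathbf{b}^{-1} = (\mathbf{a}\wedge\mathbf{b})\cdot\mathbf{b}^{-1}$.    (4.2c)


These relations are represented in slightly different ways in Figures (4.1a) and
(4.1b). The collinearity and orthogonality properties are expressed by the
equations


    $\mathbf{a}_\parallel\mathbf{b} = \mathbf{a}\cdot\mathbf{b} = \mathbf{b}\,\mathbf{a}_\parallel$,    (4.3a)
    $\mathbf{a}_\perp\mathbf{b} = \mathbf{a}\wedge\mathbf{b} = -\mathbf{b}\,\mathbf{a}_\perp$.    (4.3b)


Note that (4.3b) expresses the directed area $\mathbf{a}\wedge\mathbf{b}$ as the product of the
“altitude” $\mathbf{a}_\perp$ and “base” $\mathbf{b}$ of the $(\mathbf{a}, \mathbf{b})$-parallelogram.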
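
The resolution (4.2a-c) is easy to carry out in components. The following
Python sketch is a minimal illustration, assuming vectors are given as lists
of rectangular components; the names `dot` and `resolve` are hypothetical
helpers introduced here. The orthogonal part is computed from the expansion
$(\mathbf{a}\wedge\mathbf{b})\cdot\mathbf{c} = \mathbf{a}\,(\mathbf{b}\cdot\mathbf{c}) - \mathbf{b}\,(\mathbf{a}\cdot\mathbf{c})$ with
$\mathbf{c} = \mathbf{b}^{-1}$, rather than by subtracting the collinear part from
$\mathbf{a}$.

```python
def dot(u, v):
    """Inner product of two vectors given by rectangular components."""
    return sum(p * q for p, q in zip(u, v))

def resolve(a, b):
    """Split a into parts collinear with and orthogonal to b, as in (4.2a-c)."""
    b_inv = [q / dot(b, b) for q in b]            # b^{-1} = b / b^2
    a_par = [dot(a, b) * q for q in b_inv]        # (a.b) b^{-1}
    # (a^b).b^{-1} = a (b.b^{-1}) - b (a.b^{-1}), and b.b^{-1} = 1
    a_perp = [p - dot(a, b_inv) * q for p, q in zip(a, b)]
    return a_par, a_perp

a = [3.0, 4.0, 0.0]
b = [2.0, 0.0, 0.0]
a_par, a_perp = resolve(a, b)
```

For these inputs the collinear part is `[3.0, 0.0, 0.0]` and the orthogonal
part is `[0.0, 4.0, 0.0]`; the product of “altitude” 4 and “base” 2 gives the
area 8 of the parallelogram, in agreement with (4.3b).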


Fig. 4.1. (a), (b).


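Our considerations are easily generalized as follows. In preceding sections,
we have seen that a k-blade $\mathbf{B}$ determines a k-dimensional vector space
called $\mathbf{B}$-space. The relative direction of $\mathbf{B}$ and some vector
$\mathbf{a}$ is completely characterized by the geometric product


    $\mathbf{a}\mathbf{B} = \mathbf{a}\cdot\mathbf{B} + \mathbf{a}\wedge\mathbf{B}$.    (4.4)


The vector $\mathbf{a}$ is uniquely resolved into a vector $\mathbf{a}_\parallel$ in
$\mathbf{B}$-space and a vector $\mathbf{a}_\perp$ orthogonal to $\mathbf{B}$-space by the
equations


    $\mathbf{a} = \mathbf{a}_\parallel + \mathbf{a}_\perp$,    (4.5a)
where

    $\mathbf{a}_\parallel = P_{\mathbf{B}}(\mathbf{a}) = \mathbf{a}\cdot\mathbf{B}\,\mathbf{B}^{-1}$,    (4.5b)

    $\mathbf{a}_\perp = P_{\mathbf{B}}^{\perp}(\mathbf{a}) = \mathbf{a}\wedge\mathbf{B}\,\mathbf{B}^{-1}$.    (4.5c)


Besides the case k = 1 which we have already considered, we are most
interested in the case k = 2, when $\mathbf{B}$ is a bivector. The latter case is
depicted in Figure 4.2.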

The vector $\mathbf{a}_\parallel$ determined by Equation (4.5b) is called the
projection of $\mathbf{a}$ into $\mathbf{B}$-space, while $\mathbf{a}_\perp$ determined by (4.5c)
is called the rejection of $\mathbf{a}$ from $\mathbf{B}$-space. The new term “rejection”
has been introduced here in the absence of a satisfactory standard name for
this important concept. Although we will not make much use of it, the notation
$P_{\mathbf{B}}(\mathbf{a})$ has been introduced in (4.5b) to emphasize that the projection
is a function (or operator) $P_{\mathbf{B}}$ depending on the blade $\mathbf{B}$ with the
value $\mathbf{a}_\parallel$ when operating on $\mathbf{a}$. This function is explicitly
defined in terms of the geometric product by Equation (4.5b). Similarly, the
rejection $P_{\mathbf{B}}^{\perp}$ is a function determined by Equation (4.5c).

Fig. 4.2. Projection and rejection of a vector $\mathbf{a}$ by a bivector $\mathbf{B}$.

From Equations (1.5) we get the generalization of Equations (4.3a, b):


    $\mathbf{a}_\parallel\mathbf{B} = \mathbf{a}\cdot\mathbf{B} = (-1)^{k+1}\mathbf{B}\,\mathbf{a}_\parallel$,    (4.6a)
    $\mathbf{a}_\perp\mathbf{B} = \mathbf{a}\wedge\mathbf{B} = (-1)^{k}\mathbf{B}\,\mathbf{a}_\perp$.    (4.6b)

For the important case k = 2, these equations imply that a vector is in the
$\mathbf{B}$-plane if and only if it anticommutes with $\mathbf{B}$, and it is orthogonal
to the $\mathbf{B}$-plane if and only if it commutes with $\mathbf{B}$.

The conventional notion of “a direction” is given a precise mathematical
representation by “a unit vector”, so it is often convenient to refer to the
unit vectors themselves as directions. An angle is a relation between two
directions. To give this relation a precise mathematical expression, let
$\theta$ be the angle between directions $\mathbf{a}$ and $\mathbf{b}$. The sine and cosine
of the angle are defined, respectively, as the components of the rejection and
projection of one direction by the other, as indicated in Figures (4.3a, b).
These relations can be expressed by the equations


    $\mathbf{b}_\parallel = \mathbf{a}\,\mathbf{a}\cdot\mathbf{b} = \mathbf{a}\cos\theta$,    (4.7a)

    $\mathbf{b}_\perp = \mathbf{a}\,\mathbf{a}\wedge\mathbf{b} = \mathbf{a}\,\mathbf{i}\sin\theta$,    (4.7b)

or, more simply, by the equations

    $\mathbf{a}\cdot\mathbf{b} = \cos\theta$,    (4.8a)

    $\mathbf{a}\wedge\mathbf{b} = \mathbf{i}\sin\theta$,    (4.8b)


where $\mathbf{i}$ is the unit blade of the $\mathbf{a}\wedge\mathbf{b}$-plane. Equations (4.8a, b)
are just parts of the single fundamental equation


    $\mathbf{a}\mathbf{b} = e^{\mathbf{i}\theta}$,    (4.9)
where
    $e^{\mathbf{i}\theta} = \cos\theta + \mathbf{i}\sin\theta$.    (4.10)


For the time being, Equation (4.10) can be regarded as a definition of the
exponential function $e^{\mathbf{i}\theta}$.

So far, $\cos\theta$ and $\sin\theta$ are not definite functions of the angle
$\theta$, because we have not specified a definite measure for the angle. Two
measures of angle are in common use, “degree” and “radian”. We will employ the
radian measure almost exclusively, because, as will be seen, the degree
measure is not compatible with the fundamental definition of the exponential
function.

In Equation (4.9), we interpret $\theta$ as the radian measure of the angle from
$\mathbf{a}$ to $\mathbf{b}$, that is, the numerical magnitude of $\theta$ is equal to the
length of arc on the unit circle from $\mathbf{a}$ to $\mathbf{b}$, as indicated in
Figures (4.3a, b). The common convention of representing angles by scalars like
$\theta$ fails to represent the fact that angles refer to planes, in the present
case, the plane containing vectors $\mathbf{a}$ and $\mathbf{b}$. This deficiency is
remedied by representing angles by bivectors, so the angle from $\mathbf{a}$ to
$\mathbf{b}$ is represented by the bivector


    $\boldsymbol{\theta} = \mathbf{i}\theta$.    (4.11)


Here, $\mathbf{i}$ specifies the plane of the angle, while $|\theta|$ specifies the
magnitude of the angle. Note that the sign of $\theta$ in (4.11) depends on the
orientation assigned to the unit blade $\mathbf{i}$. In Figures (4.3a, b) the
orientation was chosen so that $\theta$ is positive.

Fig. 4.3a, b. Linear and angular measure.


The fact that angles are best represented by bivectors suggests that the
magnitude of an angle would be better interpreted as an area than as an arc
length. As shown in Figure 4.4, the angle $\theta$ is just twice the directed
area of the circular sector between $\mathbf{a}$ and $\mathbf{b}$. This can be ascertained
from the simple proportion

    $\dfrac{\text{area of sector}}{\text{arc length}} = \dfrac{\text{area of circle}}{\text{circumference}} = \dfrac{\pi}{2\pi}$.


The radian (arc length) measure of angle is so well established that it is
hardly worth changing, especially since it is related to the area of a
circular sector by a mere factor of two. But it will be seen that the “areal
measure” plays a more direct role in applications to geometry and physics.

Fig. 4.4. Angle and area of a circular sector.

Quite apart from the interpretation of $\boldsymbol{\theta} = \mathbf{i}\theta$ as the angle (in
radian measure) from $\mathbf{a}$ to $\mathbf{b}$, Equation (4.9) should be regarded as a
functional relation of the bivector $\boldsymbol{\theta}$ to the vectors $\mathbf{a}$ and
$\mathbf{b}$. In Section 2.2 we called the quantity

    $z = \mathbf{a}\mathbf{b} = e^{\mathbf{i}\theta}$    (4.12)

a spinor of the $\mathbf{i}$-plane, and noted that each such spinor (with $|z| = 1$)
determines a rotation in the plane. The spinor $z$ rotates each vector
$\mathbf{a}$ in the $\mathbf{i}$-plane into a vector $\mathbf{b}$ according to the equation

    $\mathbf{b} = \mathbf{a}z = \mathbf{a}e^{\mathbf{i}\theta}$.    (4.13)

Thus, the exponential function $e^{\mathbf{i}\theta}$ represents a rotation in the
$\mathbf{i}$-plane as a function of the angle of rotation.

The operational interpretation of $e^{\mathbf{i}\theta}$ as a rotation enables us to
write down several important properties of the exponential function
immediately, without appeal to the algebraic definition to be given in the
next section. To begin with, we saw in Section 2-2 that a rotation through a
right angle (with radian measure $2\pi/4 = \pi/2$) is represented by the unit
bivector $\mathbf{i}$. Hence,

    $e^{\mathbf{i}\pi/2} = \mathbf{i}$.    (4.14a)


Similarly, a rotation through two right angles (measure $\pi$) reverses
direction, hence we have the famous formula


    $e^{\mathbf{i}\pi} = -1$,    (4.14b)


which relates several remarkable constants of elementary mathematics.
Rotation through four right angles is equal to the identity transformation
represented by the “multiplicative identity” 1. Hence,


    $e^{\mathbf{i}2\pi} = 1$.    (4.14c)


The fact that a rotation through an angle $\mathbf{i}\theta$ followed or preceded by a
rotation through an angle $\mathbf{i}\phi$ is equivalent to a rotation through an
angle $\mathbf{i}(\theta + \phi)$ is expressed by the equation


    $e^{\mathbf{i}\theta}e^{\mathbf{i}\phi} = e^{\mathbf{i}(\theta + \phi)}$.    (4.15)

Consequently, for n rotations through an angle $\mathbf{i}\theta$ we get de Moivre’s
theorem:

    $(e^{\mathbf{i}\theta})^n = e^{\mathbf{i}n\theta}$.    (4.16)


Clearly the exponential function is a great aid to the arithmetic of rotations.
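
Properties (4.14a) through (4.16) can be confirmed numerically. In the
following Python sketch, which is an illustration under stated assumptions
rather than part of the text's development, Python's built-in complex numbers
stand in for the spinors of the $\mathbf{i}$-plane (the imaginary unit `1j` plays the
role of the unit bivector), and a vector $x\boldsymbol{\sigma}_1 + y\boldsymbol{\sigma}_2$ is
identified with the complex number x + yj so that the rotation (4.13) becomes
an ordinary complex multiplication. The names `spinor`, `rotate` and `close`
are hypothetical helpers.

```python
import cmath
import math

def spinor(theta):
    """Unit spinor e^{i theta} = cos(theta) + i sin(theta), Equation (4.10)."""
    return cmath.exp(1j * theta)

def rotate(vec, theta):
    """Rotate the vector (x, y) of the i-plane through the angle theta, b = a z."""
    w = complex(vec[0], vec[1]) * spinor(theta)
    return (w.real, w.imag)

def close(u, v, tol=1e-12):
    """Floating-point comparison for complex or real values."""
    return abs(u - v) < tol

theta, phi, n = 0.7, 1.9, 5                   # arbitrary test angles (radians)
quarter_turn = rotate((1.0, 0.0), math.pi / 2)  # sigma_1 goes to sigma_2
```

Because floating-point arithmetic is inexact, the identities hold only to
within rounding error, which is why a tolerance is used instead of exact
equality.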


Plane Trigonometry 


Now that the basic relations of vectors to angles and rotations in a plane have 
been established, all the standard results of plane trigonometry follow by 
simple algebraic manipulations. Trigonometry can therefore be regarded as 
an elementary part of geometric algebra. 

A central problem of trigonometry is the determination of all numerical
relations among the sides and angles of an arbitrary triangle. Consider a
triangle with sides of length a, b, c and angles of measure $\alpha$, $\beta$, $\gamma$ as shown in
Figure 4.5a. The relations among the sides and angles are completely
determined by the algebraic representation of the triangle as a vector equation

$\mathbf{a} + \mathbf{b} + \mathbf{c} = 0$, (4.17)

with $|\mathbf{a}| = a$, $|\mathbf{b}| = b$ and $|\mathbf{c}| = c$ (Figure 4.5b). According to (4.12), the
angles are related to the vectors by the equations

$-\mathbf{b}\mathbf{a} = ba\,e^{i\gamma}$, (4.18a)
$-\mathbf{c}\mathbf{b} = cb\,e^{i\alpha}$, (4.18b)
$-\mathbf{a}\mathbf{c} = ac\,e^{i\beta}$. (4.18c)

Fig. 4.5a. Scalar labels for a triangle. Fig. 4.5b. Vector labels for a triangle.


Of course, each of these three equations supplies us with two separate
equations when separated into scalar and bivector parts; for example, from
(4.18a) we get

$-\mathbf{b}\cdot\mathbf{a} = ba\cos\gamma$, (4.19a)

and

$\mathbf{a}\wedge\mathbf{b} = ba\,i\sin\gamma$. (4.19b)

The minus sign appears in (4.18a) and (4.19a) because, as Figure 4.5b shows,
the angle $\boldsymbol{\gamma} = i\gamma$ is from b to $-\mathbf{a}$, and not from b to a or a to b.

A word about the logical status of Equation (4.19a) is in order, because 
many books on vector analysis use such an equation to define the inner 
product of vectors in terms of the cosine of the angle between the vectors; 
thus, they take trigonometry as an established subject whose content is 
merely to be reexpressed in vector language. On the contrary, we have 
defined inner and outer products with no reference whatever to trigonometric 
functions, and most of our applications of geometric algebra, including some 
to trigonometry, require no mention of angles. We regard (4.19a) and the 
more general Equation (4.18a) as functional relations between angles and 
vectors, rather than primary definitions of any sort. Using these relations the 
basic trigonometric identities can be derived from the simpler and more 
general identities of geometric algebra almost as easily as they can be written 
down from memory. Of course, the main reason for regarding geometric 
algebra as logically prior to trigonometry is the fact that its scope is so much 
greater. 

Now let us derive the trigonometric formulas for a triangle (4.17) by using
inner and outer products. We already did this in Chapter 1, but without
justifying the relation to angles established above. Solving (4.17) for c and
squaring we get

$\mathbf{c}^2 = (\mathbf{a} + \mathbf{b})^2 = \mathbf{a}^2 + \mathbf{b}^2 + \mathbf{a}\mathbf{b} + \mathbf{b}\mathbf{a}$,

or

$c^2 = a^2 + b^2 + 2\,\mathbf{a}\cdot\mathbf{b}$, (4.20)

and, by using (4.19a) to express $\mathbf{a}\cdot\mathbf{b}$ as a function of angle, we get

$c^2 = a^2 + b^2 - 2ab\cos\gamma$. (4.21)


This is the law of cosines in trigonometry. The same name may be given to the 
equivalent equation (4.20), though it does not explicitly refer to a cosine and 
has many applications which require no such reference. 

By taking the outer product of (4.17) first by a and then by b or c, we get
the equations

$\mathbf{a}\wedge\mathbf{b} = \mathbf{b}\wedge\mathbf{c} = \mathbf{c}\wedge\mathbf{a}$. (4.22)

From these equations we easily get relations among the angles of the triangle
by using (4.19b) and the corresponding relations from (4.18b) and (4.18c);
thus

$\frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c}$. (4.23)

This set of equations is the law of sines in trigonometry. It should be noted
that, although (4.23) is an immediate consequence of (4.22), the latter is
somewhat more general than the former because it includes the factor i
representing the direction of the plane. The full generality is helpful when
trigonometric relations in a plane are to be related to 3-dimensional space, as
in the laws of reflection and refraction.

Equation (4.22) can be regarded as giving three equivalent ways of determining
the area of the triangle. We have discussed the interpretation of $\mathbf{a}\wedge\mathbf{b}$
as the directed area of a parallelogram; our triangle has only half that area.
Hence, the directed area A of the triangle is given by

$\mathbf{A} = \tfrac{1}{2}\mathbf{a}\wedge\mathbf{b} = \tfrac{1}{2}\mathbf{b}\wedge\mathbf{c} = \tfrac{1}{2}\mathbf{c}\wedge\mathbf{a}$. (4.24)

Using (4.19b), we get the more conventional expression for the area of a
triangle,

$|\mathbf{A}| = \tfrac{1}{2}ab\sin\gamma$, (4.25)

or, one half the base a times the altitude $b\sin\gamma$.

The laws of sines and cosines refer to the scalar and bivector parts of
Equations (4.18a, b, c) separately. A property of the triangle which makes
more direct use of these equations is derived by multiplying the three
equations together and dividing by $a^2b^2c^2$ to get

$e^{i\alpha}e^{i\beta}e^{i\gamma} = -1$. (4.26)

This says that successive rotations through the three interior angles of a
triangle are equivalent to a rotation through a straight angle. By virtue of
(4.14b) and (4.15), then, we can conclude that

$\alpha + \beta + \gamma = \pi$. (4.27)


This familiar result is traditionally regarded as a theorem of geometry rather 
than of trigonometry. But we use exactly the same techniques of geometric 
algebra to prove the theorems of both subjects. 
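The triangle relations above can be checked on a concrete example. The sketch below rests on an assumption that is not part of the text: a plane vector is modelled by a complex number and the geometric product ab by `conj(a) * b`, whose real part plays the role of the inner product and whose imaginary part is the coefficient of i in the outer product. The triangle itself and all function names are hypothetical.

```python
import cmath
import math

# Assumption: plane vectors are complex numbers and the geometric product ab
# is modelled by conj(a) * b (real part = a.b, imaginary part = coefficient of i in a^b).
def inner(a: complex, b: complex) -> float:
    return (a.conjugate() * b).real

def outer(a: complex, b: complex) -> float:
    """Coefficient of the unit bivector i in the outer product a^b."""
    return (a.conjugate() * b).imag

def angle_from(a: complex, b: complex) -> float:
    """Signed angle (radians) from a to b, read off from the spinor ab."""
    return cmath.phase(a.conjugate() * b)

# Hypothetical triangle, traversed counterclockwise, with a + b + c = 0 as in (4.17).
a = complex(4.0, 0.0)
b = complex(-1.0, 3.0)
c = -(a + b)
la, lb, lc = abs(a), abs(b), abs(c)

# Interior angles as in (4.18a, b, c): gamma runs from b to -a, and so on.
gamma = angle_from(b, -a)
alpha = angle_from(c, -b)
beta = angle_from(a, -c)

eq_4_20 = math.isclose(lc**2, la**2 + lb**2 + 2 * inner(a, b))
law_of_cosines = math.isclose(lc**2, la**2 + lb**2 - 2 * la * lb * math.cos(gamma))
law_of_sines = (math.isclose(math.sin(alpha) / la, math.sin(beta) / lb)
                and math.isclose(math.sin(beta) / lb, math.sin(gamma) / lc))
equal_outer_products = (math.isclose(outer(a, b), outer(b, c))
                        and math.isclose(outer(b, c), outer(c, a)))   # (4.22)
area = 0.5 * outer(a, b)                                              # (4.24)
angle_sum = alpha + beta + gamma                                      # (4.27)
```

Because the triangle is traversed counterclockwise, all three angles come out positive and the outer products share the same sign, which is what (4.22) requires.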

Geometric algebra is just as effective for formulating and deriving the
results of spherical trigonometry, as shown in Appendix A.

2-4. Exercises

(4.1) Deduce Equations (4.14b, c) from (4.14a) by (4.16).

(4.2) Prove and interpret the identities

$e^{i(\theta + 2\pi n)} = e^{i\theta}$, $e^{i\theta}e^{-i\theta} = 1$,

where n is an integer.

(4.3) From Equation (4.15) derive the trigonometric identities

$\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi$,
$\sin(\theta + \phi) = \cos\theta\sin\phi + \sin\theta\cos\phi$.

(4.4) Prove that

$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}$, $\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}$,

$i\tan\theta = \frac{e^{2i\theta} - 1}{e^{2i\theta} + 1}$.

(4.5) Prove that $\mathbf{a}\mathbf{b} = e^{i\theta}$ and $\mathbf{a}^2 = \mathbf{b}^2 = 1$ imply

(a) $\mathbf{b}\mathbf{a} = e^{-i\theta}$,
(b) $(\mathbf{a} - \mathbf{b})^2 = 4\sin^2\tfrac{1}{2}\theta$,
(c) $(\mathbf{a} + \mathbf{b})^2 = 4\cos^2\tfrac{1}{2}\theta$.

(4.6) Locate the following points on an Argand diagram (Figure 2.1b):

it/4 i37/2 3 i/6
>.
1 :
e : a (BO Vr" te i).

(4.7) Solve the equation

$1 + e^{i\theta} + e^{2i\theta} + e^{3i\theta} + e^{4i\theta} + e^{5i\theta} = 0$

by interpreting the terms as operators on a vector and identifying
the geometrical figure generated.

(4.8) Prove the following identities, and identify trigonometric identities
to which they reduce when $\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c} = 0$.

(a) $(\mathbf{a}\cdot\mathbf{b})^2 - (\mathbf{a}\wedge\mathbf{b})^2 = a^2b^2$.
(b) $\mathbf{b}\cdot(\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c}) = \mathbf{b}\cdot\mathbf{a}\,\mathbf{b}\wedge\mathbf{c} - b^2\,\mathbf{a}\wedge\mathbf{c} + \mathbf{b}\cdot\mathbf{c}\,\mathbf{a}\wedge\mathbf{b}$.
(c) $(\mathbf{a}\wedge\mathbf{b})\cdot(\mathbf{b}\wedge\mathbf{c}) = b^2\,\mathbf{a}\cdot\mathbf{c} - \mathbf{a}\cdot\mathbf{b}\,\mathbf{b}\cdot\mathbf{c}$.
(d) $2(\mathbf{a}\wedge\mathbf{b})\cdot(\mathbf{b}\wedge\mathbf{c}) = b^2\,\mathbf{a}\cdot\mathbf{c} - \mathbf{a}\cdot(\mathbf{b}\mathbf{c}\mathbf{b}) = -2(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{b}\times\mathbf{c})$.
(e) $2\,\mathbf{a}\cdot\mathbf{b}\,\mathbf{b}\cdot\mathbf{c} = b^2\,\mathbf{a}\cdot\mathbf{c} + \mathbf{a}\cdot(\mathbf{b}\mathbf{c}\mathbf{b})$.

(4.9) For the triangle in Figure 4.5a, establish the results

(a) $|\mathbf{A}| = \tfrac{1}{2}a^2\,\frac{\sin\beta\sin\gamma}{\sin\alpha}$

ees 4 = 71.

(b) Hero’s formula:

$|\mathbf{A}|^2 = s(s - a)(s - b)(s - c) = r^2s^2$,

where s is half the perimeter of the triangle and r is the radius of the
inscribed circle.

(c) Half-angle formulas:

$\tan\frac{\alpha}{2} = \frac{r}{s - a}$.

(d) The law of tangents:

$\frac{a - b}{a + b} = \frac{\tan\tfrac{1}{2}(\alpha - \beta)}{\tan\tfrac{1}{2}(\alpha + \beta)}$.

(4.10) Prove that the angle inscribed in a semicircle is a right angle, by
using the vectors indicated in Figure 4.6.

(4.11) Let a, b, c be the vertices of a triangle, as shown in Figure 4.7. Note
that vectors designating points in the figure are not represented by
arrows; this is because we are not interested in the relation of these
points to some arbitrarily designated origin. Prove the following
general theorems about triangles:

(a) The altitudes intersect at a point. This point p is called the
orthocenter of the triangle.

(b) The perpendicular bisectors of the sides intersect at a point.
Why do you think that this point q is called the circumcenter?

(c) The medians intersect at a point $\mathbf{r} = \tfrac{1}{3}(\mathbf{a} + \mathbf{b} + \mathbf{c})$ which lies
on the line segment joining the orthocenter to the circumcenter
and divides it in the ratio 2/1.

Fig. 4.6.

2-5. The Exponential Function


We have seen the great value of the exponential function for expressing the 
relation of angles to directions. It first appeared as a combination of sines and 
cosines. Our aim now is to define the exponential function and establish its 
properties from first principles. This will simplify some of our calculations and 
extend the range of applications for the function. 

The exponential function of a multivector A is denoted by exp A or $e^A$ and
defined by

$\exp A = e^A = \sum_{k=0}^{\infty}\frac{A^k}{k!} = 1 + \frac{A}{1!} + \frac{A^2}{2!} + \cdots + \frac{A^k}{k!} + \cdots$. (5.1)

This series can be shown to be absolutely convergent for all values of A by
standard mathematical arguments. Standard texts give the proof assuming
that A is a real or complex number, but the proof actually requires only that A
have a definite magnitude | A |, so it is easily extended to general
multivectors, and we shall take it for granted.

Definition (5.1) is an algebraic definition of the exponential function. This is
to say that the function is completely defined in terms of the basic operations
of addition and multiplication, so all its properties can be determined by using
these operations. The most immediate and obvious property of the exponential
function is that the value of $e^A$ is a definite multivector. This follows from
(5.1) by the closure of geometric algebra under the operations of addition and
multiplication. More particularly, it follows that if A is an element of any
subalgebra, such as $\mathcal{G}_2$, $\mathcal{G}_3^+$ or $\mathcal{G}_3$, then the value of $e^A$ is an element of the
same algebra.

The most important property of the exponential function is the “additivity
rule”

$e^Ae^B = e^{A+B}$, (5.2)

which holds if and only if AB = BA, although there are some trivial exceptions
to the “only if” condition (Exercise (5.9)). Indeed, if A and B commute,
then

$e^Ae^B = \sum_{m=0}^{\infty}\frac{A^m}{m!}\sum_{r=0}^{\infty}\frac{B^r}{r!} = \sum_{n=0}^{\infty}\sum_{k=0}^{n}\frac{A^{n-k}}{(n-k)!}\frac{B^k}{k!}$.

But, by the binomial expansion,

$(A + B)^n = \sum_{k=0}^{n}\frac{n!}{(n-k)!\,k!}A^{n-k}B^k$,

hence,

$e^Ae^B = \sum_{n=0}^{\infty}\frac{(A+B)^n}{n!} = e^{A+B}$.

Notice that the commutativity of A with B is needed to apply the binomial
expansion, so if it is lacking the result (5.2) cannot be obtained.
We will always be able to write

$e^Ae^B = e^C$,

but when $C \ne A + B$, the problem of finding C from A and B does not have a
general solution, so different cases must be considered separately. In Chapter
5, when we study rotations we will solve the problem when A and B are
bivectors.
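The role of commutativity in the additivity rule (5.2) can be seen numerically by summing the series (5.1) directly. This is only a sketch under a stated assumption that is not the book's method: multivectors of 3-space are modelled by 2x2 complex matrices (the Pauli representation), with matrix multiplication standing in for the geometric product. All names below are illustrative.

```python
# Assumption: multivectors of 3-space are modelled by 2x2 complex matrices
# (Pauli representation); matrix multiplication stands in for the geometric product.
IDENTITY = ((1, 0), (0, 1))
SIGMA1 = ((0, 1), (1, 0))
SIGMA2 = ((0, -1j), (1j, 0))
SIGMA3 = ((1, 0), (0, -1))

def mul(x, y):
    return tuple(
        tuple(sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def add(x, y):
    return tuple(tuple(x[r][c] + y[r][c] for c in range(2)) for r in range(2))

def scale(s, x):
    return tuple(tuple(s * x[r][c] for c in range(2)) for r in range(2))

def exp_series(x, terms=40):
    """Sum the first `terms` terms of the exponential series (5.1)."""
    total, power = IDENTITY, IDENTITY
    for k in range(1, terms):
        power = scale(1.0 / k, mul(power, x))   # power is now x^k / k!
        total = add(total, power)
    return total

def max_diff(x, y):
    return max(abs(x[r][c] - y[r][c]) for r in range(2) for c in range(2))

I12 = mul(SIGMA1, SIGMA2)   # unit bivector sigma1 sigma2 (squares to -1)
I23 = mul(SIGMA2, SIGMA3)   # unit bivector sigma2 sigma3 (squares to -1)

def additivity_gap(a, b):
    """Largest matrix entry of e^a e^b - e^(a+b)."""
    return max_diff(mul(exp_series(a), exp_series(b)), exp_series(add(a, b)))

# Two bivectors in the same plane commute; bivectors in different planes do not.
gap_commuting = additivity_gap(scale(0.3, I12), scale(1.1, I12))
gap_noncommuting = additivity_gap(scale(0.3, I12), scale(1.1, I23))
```

For the commuting pair the gap is at the level of round-off, while for the non-commuting pair it is of order 0.1, so (5.2) visibly fails there.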


The hyperbolic cosine and sine functions are defined by the usual series
expansions

$\cosh A = \sum_{k=0}^{\infty}\frac{A^{2k}}{(2k)!} = 1 + \frac{A^2}{2!} + \frac{A^4}{4!} + \cdots$, (5.3a)

$\sinh A = \sum_{k=0}^{\infty}\frac{A^{2k+1}}{(2k+1)!} = A + \frac{A^3}{3!} + \frac{A^5}{5!} + \cdots$. (5.3b)

These are just the even and odd parts of the exponential series (5.1); thus,

$e^A = \cosh A + \sinh A$. (5.3c)

The multivector A is called the argument of each of the functions in (5.3c).
The cosine and sine functions are defined by the usual series expansions

$\cos A = \sum_{k=0}^{\infty}(-1)^k\frac{A^{2k}}{(2k)!} = 1 - \frac{A^2}{2!} + \frac{A^4}{4!} - \frac{A^6}{6!} + \cdots$, (5.4a)

$\sin A = \sum_{k=0}^{\infty}(-1)^k\frac{A^{2k+1}}{(2k+1)!} = A - \frac{A^3}{3!} + \frac{A^5}{5!} - \frac{A^7}{7!} + \cdots$. (5.4b)

If I is a multivector with the properties $I^2 = -1$ and IA = AI, then it is easily
shown by substitution in the series (5.3a, b) that

$\cosh IA = \cos A$, (5.5a)

$\sinh IA = I\sin A$, (5.5b)

and

$e^{IA} = \cos A + I\sin A$. (5.5c)


It will be noticed that this last equation generalizes (4.10). 
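The substitution behind (5.5a, b, c) can be written out explicitly. The lines below are a sketch that takes the cosine and sine series (5.4a, b) in their usual alternating form and uses only the two stated properties of I.

```latex
% Uses only I^2 = -1 and IA = AI, so that (IA)^{2k} = I^{2k}A^{2k} = (-1)^k A^{2k}.
\begin{align*}
\cosh IA &= \sum_{k=0}^{\infty}\frac{(IA)^{2k}}{(2k)!}
          = \sum_{k=0}^{\infty}(-1)^k\frac{A^{2k}}{(2k)!} = \cos A, \\
\sinh IA &= \sum_{k=0}^{\infty}\frac{(IA)^{2k+1}}{(2k+1)!}
          = I\sum_{k=0}^{\infty}(-1)^k\frac{A^{2k+1}}{(2k+1)!} = I\sin A, \\
e^{IA} &= \cosh IA + \sinh IA = \cos A + I\sin A .
\end{align*}
```

The commutation IA = AI is what allows the powers of I to be collected in front of the powers of A in each term.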

When A is a scalar, the definitions (5.4a, b) for the trigonometric functions
reduce to the well-known series expansions for sines and cosines established
in elementary calculus. Therefore, we are assured that they apply to the
trigonometric functions employed in Section 2-4. So, considering (5.5c), we
have from (5.1) and (4.9) the following expansion of the product of two unit
vectors in terms of their relative angle:

$\hat{\mathbf{a}}\hat{\mathbf{b}} = e^{i\theta} = 1 + i\theta - \frac{\theta^2}{2!} - i\frac{\theta^3}{3!} + \cdots$. (5.6)

As emphasized before, the exponential function and its series expansion
require that the angle $\theta$ be measured in radians, though of course it can be
expressed in degrees by inserting a conversion constant.

As indicated by (5.3c) and (5.5c) the hyperbolic and trigonometric func- 
tions are best regarded as parts of the more fundamental exponential function. 
We have considered only the basic algebraic properties of the exponential 
function in this section. Differentiation of the exponential function will be 
considered when the need arises, and we will learn more about this remark- 
able function as we encounter it in physical applications. 


Logarithms 


The following discussion of the natural log function can be omitted by readers 
just beginning the study of geometric algebra, as it involves some subtle 
points which can be ignored in elementary applications. However, it will be 
needed as background for our discussion of rotations in Chapter 5. 

The exponential function associates a unique multivector B with every
multivector A according to the formula

$B = e^A$. (5.7)

We know that if A is any scalar, then B is a positive scalar, and further, that
for positive scalars the exponential function has an inverse called the
logarithmic function, for which we write

$A = \log B$. (5.8)

This brings up the question: Since we have extended the domain of the
exponential function to all multivectors, can we not do the same for the
logarithmic function? The answer is “no!”, as long as the logarithm is
required to be the inverse of the exponential. The reader may be convinced
by experimenting with the exponential series (5.1) that there is no choice of A 
which can make the series sum to a vector. Thus, no nonzero vector can be 
expressed as an exponential of some other quantity, so vectors cannot have 
logs. 

Although we cannot define the logarithmic function on all multivectors, 
evidently we can define it for any multivector B which can be expressed as an 
exponential of some other multivector as in (5.7). But here we meet another 
difficulty, for the exponential function is many-to-one, that is, there exist
many different multivectors $A_k$ (k = 1, 2, . . .) such that

$B = e^{A_1} = e^{A_2} = \cdots = e^{A_k} = \cdots$. (5.9)

Therefore, the inverse function must be many-valued, and for any of the $A_k$
we can write

$A_k = \log B$. (5.10)

We can make the logarithmic function single-valued by introducing a rule to
pick out just one of the $A_k$ as its value. It is most convenient to choose the one
with the smallest magnitude. If $|A_1| < |A_k|$ for k > 1, we write

$A_1 = \mathrm{Log}\, B$, (5.11)

and call $A_1$ the principal part of the logarithm. The restriction to the principal
part is indicated in (5.11) by the capital L. Ambiguity can still arise, however,
because for some B there is more than one $A_k$ with smallest magnitude; to
choose between them a new rule must be introduced, such as one considered
below.

To be more specific, let us consider the logarithmic function on spinors.
Any spinor z in $\mathcal{G}_3^+$ can be written in the equivalent forms

$z = \lambda e^{i\theta_k} = e^{\alpha + i\theta_k}$, (5.12)

where i is a unit bivector, $\theta_k$ is a scalar and $\alpha$ is the logarithm of the positive
scalar $\lambda$. Evidently,

$\log z = \alpha + i\theta_k$. (5.13)

Thus, every spinor has a logarithm. The logarithm is not unique, however,
because of the multiplicity of possible values for the angle $\theta_k$. Let $\theta$ be the
smallest of these angles. Noting that $e^{i2\pi k} = 1$ for any integer k, we verify that

$e^{i\theta} = e^{i\theta}e^{i2\pi k} = e^{i(\theta + 2\pi k)}$. (5.14)

Therefore, any of the angles

$\theta_k = \theta + 2\pi k$ (5.15)

will satisfy (5.12), and evidently no other angles will. Since the possible angles
differ by any positive or negative multiple of $2\pi$, the smallest of them, $\theta$, must
be confined to the interval $-\pi \le \theta \le \pi$. Having determined the allowed
range of its bivector part, the principal part of the logarithm is well defined,
and we write

$\mathrm{Log}\, z = \alpha + i\theta$. (5.16)

Ambiguity arises when $|\theta| = \pi$, for then both $i\pi$ and $-i\pi$ might be allowed in
(5.16). Choice of one of them, say $i\pi$, amounts to a choice of orientation for
the i-plane. Accordingly, we write

$\mathrm{Log}(-1) = i\pi$. (5.17)

Note also that $e^{\pm i\pi} = -1$, where i is the unit pseudoscalar for $\mathcal{G}_3$. So we
have

$\log(-1) = \pm i\pi$ (5.18)


as well. This possibility is eliminated if we are interested only in spinor-valued 
logarithms. In applications the appropriate value for a logarithm will be 
determined by the problem at hand. 
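For spinors of a single plane the many-valued logarithm and its principal part can be tried out with Python's `cmath` module. This is a sketch under the assumption that the unit bivector i is modelled by `1j`; `cmath.log` returns the value whose imaginary part lies in the interval from -π to π, and it resolves the ambiguity at -1 in favour of +iπ, matching the choice made in (5.17). The function names are illustrative.

```python
import cmath
import math

# Assumption: spinors of the i-plane are modelled by complex numbers (i -> 1j).
def principal_log(z: complex) -> complex:
    """Log z = alpha + i theta, with theta chosen by cmath in (-pi, pi]."""
    return cmath.log(z)

def log_branch(z: complex, k: int) -> complex:
    """Another logarithm of z: the principal value shifted by i 2 pi k, as in (5.15)."""
    return cmath.log(z) + 2j * math.pi * k

z = 2.0 * cmath.exp(0.75j)          # lambda = 2, theta = 0.75
principal = principal_log(z)         # expected: log 2 + 0.75 i

# Every branch exponentiates back to the same spinor, as (5.9) asserts.
branches_agree = all(
    abs(cmath.exp(log_branch(z, k)) - z) < 1e-12 for k in range(-2, 3)
)

log_of_minus_one = principal_log(-1)  # expected: i pi, as in (5.17)
```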

Finally, we note that, although vectors do not have logarithms, the product
of vectors a and b is a spinor, so we can write

$\mathrm{Log}(\mathbf{a}\mathbf{b}) = \log|\mathbf{a}| + \log|\mathbf{b}| + i\theta$, (5.19)

where $i\theta$ is the directed angle from a to b. The exponential being the inverse
of the logarithm, we have

$\mathbf{a}\mathbf{b} = e^{\mathrm{Log}(\mathbf{a}\mathbf{b})} = e^{\log(\mathbf{a}\mathbf{b})}$. (5.20)


2-5. Exercises 


(5.1) Establish the following general properties of the exponential function:
(a) The “algebraic inverse” of $e^A$ is $(e^A)^{-1} = e^{-A}$.
(b) $(e^A)^n = e^{nA}$ for scalar values of n.
(c) $(e^A)^{\dagger} = e^{A^{\dagger}}$.
(d) If AB = BA, then $e^AB = Be^A$.
(e) If AB = -BA, then $e^AB = Be^{-A}$.
(5.2) The hyperbolic functions are strictly even or odd functions in the
following sense:

$\cosh A = \frac{e^A + e^{-A}}{2} = \cosh(-A)$,

$\sinh A = \frac{e^A - e^{-A}}{2} = -\sinh(-A)$.

Similarly,

$\cos(-A) = \cos A$,
$\sin(-A) = -\sin A$.

Justify these relations.
(5.3) Show that if $A^2 = |A|^2$, then

$\cosh A = \cosh|A|$,
$\sinh A = \hat{A}\sinh|A|$,

and if $A^2 = -|A|^2$, then

$\cosh A = \cos|A|$,
$\sinh A = \hat{A}\sin|A|$.


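(5.4) Let a be a vector in $\mathcal{G}_3$, and write $\mathbf{a} = a\mathbf{n}$, where n is a unit vector,
and $a = \mathbf{n}\cdot\mathbf{a} = \pm|\mathbf{a}|$ is possibly negative. Prove that

$\cos\mathbf{a} = \cos a$, $\cosh\mathbf{a} = \cosh a$,
$\sin\mathbf{a} = \mathbf{n}\sin a$, $\sinh\mathbf{a} = \mathbf{n}\sinh a$,
$e^{i\mathbf{a}} = \cos a + i\mathbf{n}\sin a$, $e^{\mathbf{a}} = \cosh a + \mathbf{n}\sinh a$.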


(5.5) Prove that if $J^2 = 1$ and AJ = JA, then

$\cosh JA = \cosh A$, $\sinh JA = J\sinh A$,
$e^{JA} = \cosh A + J\sinh A$.

(5.6) Prove that if $I^2 = -1$ and A, B, I mutually commute, then

$\cos(A + IB) = \cos A\cosh B - I\sinh B\sin A$,
$\sin(A + IB) = \sin A\cosh B + I\sinh B\cos A$.

(5.7) Evaluate $\sin(\pi/4 + I)$ and $\cos(\pi/3 + I)$ when $I^2 = -1$.
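(5.8) For vectors a and b, show that if $|\mathbf{a}| > |\mathbf{b}|$, then

$\frac{1}{\mathbf{a} - \mathbf{b}} = \frac{1}{\mathbf{a}} + \frac{1}{\mathbf{a}}\mathbf{b}\frac{1}{\mathbf{a}} + \frac{1}{\mathbf{a}}\mathbf{b}\frac{1}{\mathbf{a}}\mathbf{b}\frac{1}{\mathbf{a}} + \cdots$.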
(5.9) Let the angle between unit vectors a and b be $2\pi/3$. Show that
$\mathbf{c} = \mathbf{a} + \mathbf{b}$ is also a unit vector. Define $A = \alpha i\mathbf{a}$, $B = \alpha i\mathbf{b}$ and show that A and
B do not commute. Show that if $\alpha$ is an integer multiple of $2\pi$, then
Equation (5.2) is satisfied. However, Equation (5.2) is not satisfied
for any other value of $\alpha$. Show that for $\alpha = \pi$, $e^Ae^B = -e^{A+B}$.


2-6. Analytic Geometry 


This section can be skipped or lightly perused by readers who are in a hurry to 
get on with mechanics. It is included here as a reference on elementary 
concepts and results of Analytic Geometry expressed in terms of geometric 
algebra. 

Analytic Geometry is concerned with the description or, if you will, the 
representation of geometric curves and surfaces by algebraic equations. The 
traditional approach to Analytic Geometry is accurately called Coordinate 
Geometry, because it represents each geometrical point by a set of scalars 
called its coordinates. Curves and surfaces are then represented by algebraic 
equations for the coordinates of their points. A major drawback of Coordinate
Geometry is the fact that coordinates carry superfluous information
which often entails unnecessary complications. Thus, rectangular coordinates
(x, y, z) of a point specify the distances of the point from the three coordinate
planes, the coordinate z, for example, specifying the distance from the 
(xy)-plane. Therefore, equations for a geometric figure in rectangular coordi- 
nates describe the relation of that figure to three arbitrarily chosen planes. 
Obviously, it would be more efficient to describe the figure in terms of its 
intrinsic properties alone, without introducing extrinsic relations to lines or 
planes which are frequently of no interest. Geometric algebra makes this 
possible. 

In the language of geometric algebra, each geometrical point is represented 
or labelled by a vector. Indeed, for mathematical purposes it is often simplest 
to regard the point and the vector that labels it as one and the same. Of 
course, we can label a given point by any vector we please, and problems can 
often be simplified by a judicious selection of the point to be labelled by the 
zero vector. But, as a rule, once a labelling has been selected, it is unnecess- 
ary to change it. 

The distinction between a point and its vector label becomes important 
when geometric algebra is used as a language, for then the point, which is 
undefined as a mathematical entity, might be identified with a mark on a piece 
of paper or a “‘place”’ among physical objects. However, the vector label 
retains its status as a purely mathematical entity, and geometric algebra 
precisely describes its geometric properties. It will be noticed also in the 
following that, although some vectors designate (or are designated as) points, 
other vectors describe relations between points or have some other geometrical
significance.

The simplest relation between two points a and b is the vector a - b, which,
for want of standard terminology, we propose to call the chord from b to a.
The magnitude of the chord | a - b | is called the (Euclidean) distance
between b and a. The zero vector designates a point called the origin. Since
a - 0 = a, the vector a specifies both the point and the chord from the origin
to the point a, and | a | = | a - 0 | is the distance between the point a and the
origin.

Geometric spaces and figures are sets of points. Euclidean Geometry is
concerned with distance relations of the form | a - b | among pairs of points in
such spaces and figures. Non-Euclidean geometries are based on alternative
definitions of the distance between points, such as log | a - b |. However
interesting it is to explore the implications of alternative definitions of 
distance, we want our definition to correspond to the relations among physi- 
cal objects determined by the operational rules for measuring distance, and it 
is a physical fact that, at least to a high order of approximation, such relations 
conform to the Euclidean definition of distance. For this reason, we will be 
concerned with the Euclidean concept of distance only, and the adjective
“Euclidean” will be unnecessary.

A set $\mathcal{E}_n$ with elements called points is said to be an n-dimensional Euclidean
Space if it has the following properties:

(1) The points in $\mathcal{E}_n$ can be put in one-to-one correspondence with the
vectors in an n-dimensional vector space. (Each vector then is said to
designate or label the corresponding point.)

(2) There is a rule for assigning a positive number to every pair of points
called the distance between the points, and the points can be labelled in
such a way that the distance between any pair of points a and b is given
by $|\mathbf{a} - \mathbf{b}| = [(\mathbf{a} - \mathbf{b})^2]^{1/2}$.

Obviously, any vector space can be regarded as a Euclidean space simply by
regarding each vector as identical with the point it labels. For most
mathematical purposes it is quite sufficient to regard each Euclidean space as a
vector space. However, in physical applications it is essential to distinguish
between each point and the vector which labels it. This is apparent in the most
fundamental application of all, the application of geometry to measurement.

In Chapter 1 we saw that a complex system of operational rules is needed to 
determine physical points (i.e. positions or places) and measure distances 
between them. The set of all physical points determined by these rules is 
called Physical Space. We saw that Euclidean geometry has certain physical 
implications when interpreted as a physical theory. We can now completely 
formulate the physical implications of geometry in the single proposition: 
Physical Space is a 3-dimensional Euclidean Space. This proposition could be 
called the Zeroth Law of physics, because it is presumed in the theory of 
measurement and so in every branch of physics, although the Law must be 
modified or reinterpreted somewhat to conform to Einstein’s theory of 
relativity and gravitation. Chapter 9 gives a more complete formulation and 
discussion of the Zeroth Law in relation to the other laws of mechanics. 

We label the points of Physical Space by vectors in the geometric algebra
$\mathcal{G}_3$. These vectors compose a 3-dimensional Euclidean space $\mathcal{E}_3$,
which can be regarded as a mathematical model of Physical Space. The
properties of points in $\mathcal{E}_3$, such as their relations to other points, to lines and
to planes, require the complete algebra $\mathcal{G}_3$ for their description. Since $\mathcal{G}_3$
thereby provides us with the necessary language to describe relations among
points in Physical Space, it is appropriate to call $\mathcal{G}_3$ the geometric algebra of
Physical Space.

The study of curves and surfaces in $\mathcal{E}_3$ is a purely mathematical enterprise,
but its relevance to physics is assured by the correspondence of $\mathcal{E}_3$ with
Physical Space. By appropriate semantic assumptions a curve in $\mathcal{E}_3$ can be
variously interpreted as the path of a particle, a boundary on a surface or the 
edge of a solid body. But in the rest of this section, all such interpretations are 
deliberately ignored, as we learn to describe the form of curves and surfaces 
with geometric algebra. Our results can then be used in a variety of physical 
contexts when we introduce semantic assumptions later on.

Straight Lines

The most basic equations in analytic geometry are those for lines and planes.
In Section 2-2 we saw that the equation

$\mathbf{x}\wedge\mathbf{u} = 0$ (6.1)

determines a line through the origin when u is a fixed nonzero vector. The
substitution $\mathbf{x} \to \mathbf{x} - \mathbf{a}$ in (6.1) has the effect of rigidly displacing each point
on the line by the same amount a. From this we conclude that the line
$\mathcal{L} = \{\mathbf{x}\}$ with direction u passing through the point a is determined by the
equation

$(\mathbf{x} - \mathbf{a})\wedge\mathbf{u} = 0$. (6.2)

It should be noted that this equation determines the line without reference to
any space in which the line might be imbedded, although, of course, we are
most interested in lines in $\mathcal{E}_3$.


Moment and Directance of a Line 


Equation (6.2) is a necessary and sufficient condition for a point x to lie on the
line $\mathcal{L}$. All the properties of a line, such as its relations to specified points,
lines and planes can be derived from the defining equation (6.2) by geometric
algebra. To see how this can best be done, we derive and study various
alternative forms of the defining equation; each reveals a different property of
the line. On writing M for the bivector $\mathbf{a}\wedge\mathbf{u}$, Equation (6.2) takes the form

$\mathbf{x}\wedge\mathbf{u} = \mathbf{M}$. (6.3)

Since $\mathbf{x}\wedge\mathbf{u}\wedge\mathbf{u} = 0$, multiplication of (6.3) by $\mathbf{u}^{-1}$ and use of (2.4) as well as
(2.11) yields

$(\mathbf{x}\wedge\mathbf{u})\cdot\mathbf{u}^{-1} = \mathbf{x} - \mathbf{x}\cdot\mathbf{u}\,\mathbf{u}^{-1} = \mathbf{M}\mathbf{u}^{-1}$.

Hence, for fixed M and u,

$\mathbf{x} = (\mathbf{M} + \alpha)\mathbf{u}^{-1}$ (6.4)

is a parametric equation for the line $\mathcal{L}$, each point $\mathbf{x} = \mathbf{x}(\alpha)$ being determined
by a value of the parameter $\alpha$.
Introducing the vector

$\mathbf{d} = \mathbf{M}\mathbf{u}^{-1} = \mathbf{M}\cdot\mathbf{u}^{-1}$, (6.5)

Equation (6.4) takes the form

$\mathbf{x} = \mathbf{d} + \alpha\mathbf{u}^{-1}$. (6.6)

Note that d is orthogonal to u, since, by (6.5),

$\mathbf{d}\cdot\mathbf{u} = \langle\mathbf{d}\mathbf{u}\rangle_0 = \langle\mathbf{M}\rangle_0 = 0$.

So, by squaring (6.6), one obtains the following expression for the distance
$|\mathbf{x}| = |\mathbf{x} - 0|$ between the origin and a point x on $\mathcal{L}$:

$\mathbf{x}^2 = \mathbf{d}^2 + \alpha^2\mathbf{u}^{-2}$.

This has its minimum value when x = d. Thus, d is that point on the line
$\mathcal{L}$ which is “closest” to the origin. We call d the directance (= directed
distance) from the point 0 to the line $\mathcal{L}$. The magnitude | d | is called the
distance from the point 0 to the line $\mathcal{L}$ (Figure 6.1).

Fig. 6.1. For a line $\mathcal{L}$ with direction u and directance d from the origin. Note the two
different representations of the moment $\mathbf{M} = \mathbf{d}\mathbf{u} = \mathbf{a}\wedge\mathbf{u}$.

By substituting (6.3) into (6.5) we get the useful expression

$\mathbf{d} = \mathbf{x}\wedge\mathbf{u}\,\mathbf{u}^{-1} = P_{\mathbf{u}}^{\perp}(\mathbf{x})$, (6.7)

where $P_{\mathbf{u}}^{\perp}$ is the rejection operator defined in Section 2-4. This tells us how to
find the directance of a line from any point of the line.

The bivector M is called the moment of the line $\mathcal{L}$. From (6.5) one finds that
$|\mathbf{d}| = |\mathbf{M}||\mathbf{u}^{-1}|$, showing, in particular, that if $|\mathbf{u}| = 1$, the magnitude of
the moment is equal to the distance from the origin to $\mathcal{L}$. From Equation (6.3),
it is clear that any oriented line $\mathcal{L}$ is uniquely determined by specifying its
direction u and its moment M, or equivalently, the single quantity
$L = \mathbf{u} + \mathbf{M} = (1 + \mathbf{d})\mathbf{u}$. We shall see that the last way of characterizing a line is useful in
rigid body mechanics.
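The directance can be computed in conventional vector terms, since the rejection in (6.7) is the part of x orthogonal to u. The sketch below uses two assumptions that are not taken from the text: vectors of 3-space are plain tuples, and the magnitude of the moment bivector a∧u is evaluated as the length of the cross product a × u. The sample line and all function names are hypothetical.

```python
import math

# Assumption: vectors of 3-space are tuples; |a^u| is computed as |a x u|.
def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def norm(p):
    return math.sqrt(dot(p, p))

def directance(point, direction):
    """Rejection of `point` from `direction`: the vector d of (6.7)."""
    scale = dot(point, direction) / dot(direction, direction)
    return tuple(pi - scale * ui for pi, ui in zip(point, direction))

a = (1.0, 2.0, 2.0)                               # a point on the line
u = (0.0, 0.0, 3.0)                               # direction of the line
x = tuple(ai + 2.5 * ui for ai, ui in zip(a, u))  # another point on the same line

d_from_a = directance(a, u)
d_from_x = directance(x, u)
moment_magnitude = norm(cross(a, u))              # |M| = |a^u|
```

Both points give the same directance, as (6.7) states, and its length agrees with |M||u⁻¹| = |M|/|u|.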


Points on a Line 


A line determines relations among pairs of points on the line. To analyze such 
relations, we put the defining equation (6.2) in a different form. Equation 
(6.2) is equivalent to the statement that the chord x — a is collinear with the 
vector u. Since x and a are any pair of points on the line, it follows (by
transitivity) that all chords of the line are collinear. If x, a, b are any three
points on the line, the collinearity of chords x - a and b - a is expressed by
the equation

$(\mathbf{x} - \mathbf{a})\wedge(\mathbf{b} - \mathbf{a}) = 0$. (6.8)


This differs from (6.2) only in the replacement of u by the chord b — a which is 
proportional to it. So (6.8) is equivalent to (6.2) if b and a are distinct points. 
Thus, we have shown that two distinct points determine the equation for a line. 
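Equation (6.8) translates directly into a collinearity test. In this sketch the outer product of two vectors in 3-space is represented by its three bivector components, which is an assumption about representation rather than anything prescribed by the text; the function names and sample points are hypothetical.

```python
# Assumption: a bivector p^q in 3-space is stored as its three components
# on the basis (e2^e3, e3^e1, e1^e2).
def wedge(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def on_line(x, a, b, tol=1e-12):
    """True if (x - a)^(b - a) = 0, i.e. Equation (6.8) holds within `tol`."""
    return all(abs(component) <= tol for component in wedge(sub(x, a), sub(b, a)))

a = (1.0, 0.0, 2.0)
b = (3.0, 4.0, 2.0)
midpoint = (2.0, 2.0, 2.0)    # lies on the line through a and b
off_line = (2.0, 2.0, 3.0)    # displaced out of the line

midpoint_is_on_line = on_line(midpoint, a, b)
off_line_is_on_line = on_line(off_line, a, b)
```

The test requires a and b to be distinct, as the text notes; if they coincide the chord b - a vanishes and every point passes trivially.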


Barycentric Coordinates 


By expanding (6.8) with the distributive rule and introducing a factor of $\tfrac{1}{2}$, we
get the equivalent equation

$\tfrac{1}{2}\mathbf{a}\wedge\mathbf{b} = \tfrac{1}{2}\mathbf{a}\wedge\mathbf{x} + \tfrac{1}{2}\mathbf{x}\wedge\mathbf{b}$. (6.9)

Now $\tfrac{1}{2}\mathbf{a}\wedge\mathbf{b}$ is the directed area of a triangle with vertices a, 0, b and “sides”
(or chords) a, b, b - a. The other two terms in (6.9) can be interpreted
similarly, and it will be noted that any two of the three triangles have one side
in common. So (6.9) merely expresses the area of a triangle as the sum of areas
of two triangles into which it can be decomposed; this is depicted in Figure
6.2a when x is between a and b and in Figure 6.2b when it is not. From (6.9) it
follows that

$\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{x} = 0$, (6.10)

so all three vectors and the three triangles they determine are in the same
plane. Denoting the unit bivector for this plane by i, we introduce the
notation for directed areas

$\mathbf{B} = \tfrac{1}{2}\mathbf{a}\wedge\mathbf{x} = Bi$, (6.11)
$\mathbf{A} = \tfrac{1}{2}\mathbf{x}\wedge\mathbf{b} = Ai$.

This notation is used to denote areas in Figures (6.2a, b). Note that the
orientation of A and hence the sign of A is opposite in the two figures.

Fig. 6.2a. $\mathbf{A} = |\mathbf{A}|i$, $\mathbf{B} = |\mathbf{B}|i$, $\mathbf{A} + \mathbf{B} = (|\mathbf{A}| + |\mathbf{B}|)i$.
Fig. 6.2b. $\mathbf{A} = -|\mathbf{A}|i$, $\mathbf{B} = |\mathbf{B}|i$, $\mathbf{A} + \mathbf{B} = (|\mathbf{B}| - |\mathbf{A}|)i$.

Let us regard a and b as fixed and let x be any point on the line they
determine, as indicated in Figures (6.2a, b). We now show that the bivectors
A and B or the determinants A and B can be used as coordinates for the point
x. Recall the Jacobi identity for vectors proved in Exercise (1.11):

$(\mathbf{a}\wedge\mathbf{b})\cdot\mathbf{x} + (\mathbf{b}\wedge\mathbf{x})\cdot\mathbf{a} + (\mathbf{x}\wedge\mathbf{a})\cdot\mathbf{b} = 0$.

Now, because of (6.10) the dots in the equation can be dropped, so, if we
introduce the notation (6.11) and use (6.9), we get

$(\mathbf{A} + \mathbf{B})\mathbf{x} = \mathbf{A}\mathbf{a} + \mathbf{B}\mathbf{b}$.

If the origin is not on the line, then (6.11) implies A and B cannot both vanish,
so we can solve for

$\mathbf{x} = \left(\frac{\mathbf{A}}{\mathbf{A} + \mathbf{B}}\right)\mathbf{a} + \left(\frac{\mathbf{B}}{\mathbf{A} + \mathbf{B}}\right)\mathbf{b}$. (6.12)

Since A and B are codirectional, we can express this in terms of the scalars A
and B defined by (6.11); thus,

$\mathbf{x} = \frac{A\mathbf{a} + B\mathbf{b}}{A + B}$. (6.13)


In the mathematical literature, the scalars A and B in (6.13) are called 
homogeneous (line) coordinates for the point x. They are also called bary- 
centric coordinates, because of a similarity of (6.13) to the formula for center 
of mass (defined in Chapter 6). But unlike masses, the scalars A and B can be 
negative and, as we have seen, they can be interpreted geometrically as 
oriented areas. They have another geometrical interpretation which we now 
determine. 
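The barycentric coordinates can be computed and inverted in a few lines. This sketch works in a plane with the origin off the line, represents vectors as 2-tuples, and takes the scalar coefficient of the unit bivector i as the signed area; those choices, the sample points and the function names are assumptions made for illustration.

```python
# Assumption: plane vectors are 2-tuples and p^q = (p0*q1 - p1*q0) i.
def wedge2(p, q):
    """Coefficient of the unit bivector i in p^q for plane vectors p, q."""
    return p[0] * q[1] - p[1] * q[0]

def barycentric(a, b, x):
    """Return the scalars (A, B) of (6.11): A i = x^b / 2 and B i = a^x / 2."""
    return 0.5 * wedge2(x, b), 0.5 * wedge2(a, x)

def from_barycentric(a, b, A, B):
    """Rebuild the point x = (A a + B b) / (A + B) of (6.13)."""
    total = A + B
    if total == 0:
        raise ValueError("A + B = 0: the origin lies on the line through a and b")
    return tuple((A * ai + B * bi) / total for ai, bi in zip(a, b))

a = (2.0, 0.0)
b = (0.0, 4.0)
x_between = (1.5, 1.0)    # a + 0.25 (b - a), between a and b
x_beyond = (-1.0, 6.0)    # a + 1.5 (b - a), beyond b

coords_between = barycentric(a, b, x_between)   # both areas positive
coords_beyond = barycentric(a, b, x_beyond)     # A is negative here
rebuilt_between = from_barycentric(a, b, *coords_between)
rebuilt_beyond = from_barycentric(a, b, *coords_beyond)
```

The sign change of A for the point beyond b is the numerical counterpart of the reversed orientation shown in Figure 6.2b.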


Division and Intersection of Lines 
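Since all chords of a line are collinear, we can write

\[ \mathbf{a} - \mathbf{x} = \lambda(\mathbf{x} - \mathbf{b}), \qquad (6.14) \]

where λ is a scalar. The outer product of this with x gives

\[ \mathbf{x}\wedge\mathbf{a} = -\lambda\,\mathbf{x}\wedge\mathbf{b}. \]

Solving both these equations for λ, and using (6.11) we find

\[ \lambda = \frac{\mathbf{a}-\mathbf{x}}{\mathbf{x}-\mathbf{b}} = \frac{\mathbf{a}\wedge\mathbf{x}}{\mathbf{x}\wedge\mathbf{b}} = \frac{\mathbf{B}}{\mathbf{A}} = \frac{B}{A}, \qquad (6.15a) \]

or

\[ \lambda = \pm\frac{|\mathbf{a}-\mathbf{x}|}{|\mathbf{x}-\mathbf{b}|} = \pm\frac{|B|}{|A|}, \qquad (6.15b) \]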


where the positive sign applies if x is between a and b and the negative sign
applies if it is not. The point x is sometimes called a point of division for the
oriented line segment [a, b], and because of (6.15), x is said to divide [a, b] in
ratio B/A. The division ratio λ can be used as a coordinate for points on the
line through a and b simply by solving (6.14) to get

\[ \mathbf{x} = \frac{\mathbf{a} + \lambda\mathbf{b}}{1 + \lambda}, \qquad (6.16) \]

which, of course, is equivalent to (6.13). Thus the midpoint of [a, b] is defined
by the condition λ = 1 and is given by ½(a + b).
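As a quick numerical illustration of (6.15a) and (6.16), here is a small sketch under the same assumption of coordinate pairs for plane vectors. The function names are hypothetical, and the case where the origin lies on the line (both determinants vanish) is deliberately not handled.

```python
def wedge(u, v):
    """Scalar coefficient of u ^ v for plane vectors."""
    return u[0] * v[1] - u[1] * v[0]


def division_ratio(a, b, x):
    """lambda = (a ^ x) / (x ^ b) = B / A, following (6.15a)."""
    return wedge(a, x) / wedge(x, b)


def divide(a, b, lam):
    """Point x = (a + lam * b) / (1 + lam) of (6.16)."""
    return ((a[0] + lam * b[0]) / (1 + lam), (a[1] + lam * b[1]) / (1 + lam))


a, b = (2.0, 0.0), (0.0, 4.0)
midpoint = divide(a, b, 1.0)
lam_inside = division_ratio(a, b, (1.5, 1.0))    # x between a and b
lam_outside = division_ratio(a, b, (3.0, -2.0))  # x outside the segment
x_back = divide(a, b, lam_inside)
```

The sign of the computed ratio distinguishes points inside the segment from points outside it, in line with the sign rule stated for (6.15b).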
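It should be noted that the division ratio (6.15) is not only independent of
an orientation for the line or for the (a∧b)-plane, it is also independent of
the location of the origin (even the ratio B/A = 0/0, which occurs when
the origin is on the line, is determined by (6.15)). This helps us deduce a
large number of geometrical facts, for by displacing the origin by an arbitrary
vector c, we get from (6.15) the following general relations among the
quantities indicated by Figure 6.3:

\[ \frac{\mathbf{a}-\mathbf{x}}{\mathbf{x}-\mathbf{b}} = \frac{\mathbf{B}}{\mathbf{A}} = \frac{\mathbf{B}'}{\mathbf{A}'} = \frac{\mathbf{B}-\mathbf{B}'}{\mathbf{A}-\mathbf{A}'} \qquad (6.17a) \]

or, in terms of determinants for the directed areas,

\[ \frac{B}{A} = \frac{B'}{A'} = \frac{B-B'}{A-A'} \qquad (6.17b) \]

or, in terms of vectors,

\[ \lambda = \frac{\mathbf{a}\wedge\mathbf{x}}{\mathbf{x}\wedge\mathbf{b}} = \frac{(\mathbf{a}-\mathbf{c})\wedge(\mathbf{x}-\mathbf{c})}{(\mathbf{x}-\mathbf{c})\wedge(\mathbf{b}-\mathbf{c})} = \frac{\mathbf{a}\wedge\mathbf{x} + \mathbf{c}\wedge\mathbf{a} + \mathbf{x}\wedge\mathbf{c}}{\mathbf{x}\wedge\mathbf{b} + \mathbf{b}\wedge\mathbf{c} + \mathbf{c}\wedge\mathbf{x}} \]
\[ \phantom{\lambda} = \frac{\mathbf{c}\wedge\mathbf{a} + \mathbf{x}\wedge\mathbf{c}}{\mathbf{b}\wedge\mathbf{c} + \mathbf{c}\wedge\mathbf{x}} = \frac{(\mathbf{x}-\mathbf{a})\wedge\mathbf{c}}{\mathbf{c}\wedge(\mathbf{x}-\mathbf{b})}. \qquad (6.17c) \]

Fig. 6.3.

These relations hold even if c is not in the (a∧b)-plane.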
For the special case when c is collinear with x, we can have one of the three
cases depicted by Figures (6.4a, b, c). All three cases are governed by (6.17).
However, the point x also divides the line segment [c, 0] in some ratio λ′
given by

(6.18)

Fig. 6.4a.

The point x is, in fact, the point of intersection of the line through points 0, c
with the line through points a, b. From any one of the figures we see that

\[ \mathbf{A} + \mathbf{B} = \tfrac{1}{2}\,\mathbf{a}\wedge\mathbf{b}, \qquad (6.19a) \]

and

\[ \mathbf{A}' + \mathbf{B}' = \tfrac{1}{2}\,(\mathbf{a}-\mathbf{c})\wedge(\mathbf{b}-\mathbf{c}). \qquad (6.19b) \]
So (6.18) gives us 


aab 


i _ ea 6.20 
«— GannGes) CE 
and 
Cc 
Pee (6.20b) 


These equations give us the point 
of intersection in terms of the vec- 
tors a, b, c. They also determine 
the point x in Figure 6.5, and, by 
interchanging a and c, they deter- 
mine the point y in the same 
figure. Thus, a small number of 
algebraic formulas describe a wide 
variety of geometric relations. 
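The intersection point discussed above can also be computed directly from the line equation. The sketch below does not reproduce the printed form of (6.20); it assumes plane vectors stored as coordinate pairs, writes the unknown point as x = alpha c, and solves (x - a)^(b - a) = 0 for the scalar alpha. The function name is hypothetical.

```python
def wedge(u, v):
    """Scalar coefficient of u ^ v for plane vectors."""
    return u[0] * v[1] - u[1] * v[0]


def intersect_with_line_through_origin(a, b, c):
    """Intersection of the line through 0 and c with the line through a and b.

    Substituting x = alpha * c into (x - a) ^ (b - a) = 0 gives
    alpha = (a ^ b) / (c ^ (b - a)).
    """
    chord = (b[0] - a[0], b[1] - a[1])
    denom = wedge(c, chord)
    if denom == 0:
        raise ValueError("the two lines are parallel")
    alpha = wedge(a, b) / denom
    return (alpha * c[0], alpha * c[1])


a, b, c = (2.0, 0.0), (0.0, 4.0), (2.0, 2.0)
x = intersect_with_line_through_origin(a, b, c)
# Residual of the line equation (x - a) ^ (b - a) = 0 at the computed point.
residual = wedge((x[0] - a[0], x[1] - a[1]), (b[0] - a[0], b[1] - a[1]))
```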


0 
Fig. 6.5. 


Planes 


The algebraic description of a plane is similar to that for a line, so we consider
planes only briefly to make this fact clear. The plane with direction U passing
through a given point a is determined by the equation

\[ (\mathbf{x} - \mathbf{a})\wedge\mathbf{U} = 0, \qquad (6.21) \]

where U is a non-zero 2-blade. Planes with equal or opposite directions are
said to be parallel to one another. Every plane 𝒫 = {x} with direction U is the
solution set of an equation with the form

\[ \mathbf{x}\wedge\mathbf{U} = T. \qquad (6.22) \]

This equation is analogous to (6.21) when T = a∧U. The trivector T is
called the moment of the plane. The directance d from the origin to 𝒫 is
given by the vector

\[ \mathbf{d} = T\mathbf{U}^{-1} = T\cdot\mathbf{U}^{-1}. \qquad (6.23) \]

The magnitude |d| = |T| |U|⁻¹ is the distance from the origin to the plane.
Finally, using (6.22) in (6.23) we get

\[ \mathbf{d} = (\mathbf{x}\wedge\mathbf{U})\,\mathbf{U}^{-1} = P_{\mathbf{U}}^{\perp}(\mathbf{x}). \qquad (6.24) \]

Thus the directance can be obtained by “rejecting” any point of the plane.
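The statement that every point of the plane is "rejected" to the same directance can be tried out in coordinates. This is only a sketch: it assumes three-dimensional vectors stored as tuples and represents the 2-blade U = u1^u2 by its dual normal vector n = u1 x u2, so that rejection from U becomes projection onto n. That substitution, and the function names, are assumptions made for illustration rather than the method of the text.

```python
def dot(u, v):
    """Inner product of two 3-vectors."""
    return sum(p * q for p, q in zip(u, v))


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def plane_directance(x, u1, u2):
    """Directance from the origin to the plane through x with direction u1 ^ u2."""
    n = cross(u1, u2)           # normal vector dual to the blade u1 ^ u2
    n2 = dot(n, n)
    if n2 == 0:
        raise ValueError("u1 and u2 do not span a plane")
    scale = dot(x, n) / n2
    return tuple(scale * k for k in n)


u1, u2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
a = (1.0, 2.0, 3.0)
other_point = (3.0, -3.0, 3.0)   # a + 2*u1 - 5*u2, another point of the same plane
d_from_a = plane_directance(a, u1, u2)
d_from_other = plane_directance(other_point, u1, u2)
```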


Spheres and Circles 


Besides lines and planes, the most elementary geometrical figures are circles
and spheres. A sphere with radius r and center c is defined as the set of all
points x in ℰ₃ satisfying the equation

\[ |\mathbf{x} - \mathbf{c}| = r, \qquad (6.25a) \]

or, equivalently,

\[ (\mathbf{x} - \mathbf{c})^2 = r^2. \qquad (6.25b) \]

Besides (6.25), the points of the sphere satisfy the equation (x − c)∧i = 0,
which is an algebraic formulation of the condition that each point x belongs to
ℰ₃.

The intersection of the sphere with a plane through its center is a circle.
Thus, a circle with center c and radius r in the i-plane is determined by
supplementing (6.25) with the condition

\[ (\mathbf{x} - \mathbf{c})\wedge\mathbf{i} = 0. \qquad (6.26) \]

Equation (6.25) can be regarded as the equation for a circle without explicitly
writing (6.26) if it is understood that the points are in ℰ₂.

The general solution of the simultaneous Equations (6.25) and (6.26) has
the convenient parametric form

\[ \mathbf{x} - \mathbf{c} = \mathbf{r}_0\, e^{\mathbf{i}\theta}, \qquad (6.27) \]

where r₀ is a fixed vector in the i-plane (r₀∧i = 0) with magnitude |r₀| = r.
The fact that (6.27) solves (6.25) and (6.26) is readily verified by substitution.
With the bivector i normalized to unity, (6.27) associates exactly one value of
θ with each point x on the circle if the values of θ are restricted by the
condition 0 ≤ θ < 2π. The generalization of (6.27) to a parametric equation
for a sphere will be made later in the chapter on rotations. We concentrate
here on general properties of circles. Equation (6.27) is only one of many
useful parametric equations for a circle. Equation (6.27) is not very helpful
when one is concerned only with points on the circle and not with the center
of the circle. A more useful parametric equation can be derived from the
constant angle theorem for a circle. This theorem states that a given arc of a
circle subtends the same angle φ at every point x on the circle outside that arc.
We prove the theorem by showing that φ = ½θ for the angles indicated in
Figure 6.6. According to the figure,

\[ \mathbf{a}^{-1}\mathbf{b} = e^{\mathbf{i}\theta}. \]


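Our arguments will apply to all circles through the points a, b if we allow θ to
have any value in the interval [−π, π]. The angle of interest φ is defined by the
equation

\[ (\mathbf{a}-\mathbf{x})^{-1}(\mathbf{b}-\mathbf{x}) = \lambda\, e^{\mathbf{i}\varphi}. \qquad (6.28) \]

Writing a⁻¹x = e^{iα} for convenience, we observe that

\[ \mathbf{a}^{-1}(\mathbf{b}-\mathbf{x}) = e^{\mathbf{i}\theta} - e^{\mathbf{i}\alpha} \]

and

\[ (\mathbf{a}-\mathbf{x})^{-1}\mathbf{a} = [\mathbf{a}^{-1}(\mathbf{a}-\mathbf{x})]^{-1} = (1 - e^{\mathbf{i}\alpha})^{-1}. \]

Hence

\[ (\mathbf{a}-\mathbf{x})^{-1}(\mathbf{b}-\mathbf{x}) = (\mathbf{a}-\mathbf{x})^{-1}\mathbf{a}\,\mathbf{a}^{-1}(\mathbf{b}-\mathbf{x}) = \frac{e^{\mathbf{i}\theta} - e^{\mathbf{i}\alpha}}{1 - e^{\mathbf{i}\alpha}}. \]

Inserting this in (6.28), we get

\[ \lambda\, e^{\mathbf{i}\varphi} = \frac{e^{\mathbf{i}\theta} - e^{\mathbf{i}\alpha}}{1 - e^{\mathbf{i}\alpha}}. \]

We can eliminate λ by dividing this equation by its reverse, and we find that α
disappears as well; thus,

\[ e^{2\mathbf{i}\varphi} = \left(\frac{e^{\mathbf{i}\theta} - e^{\mathbf{i}\alpha}}{1 - e^{\mathbf{i}\alpha}}\right)\left(\frac{1 - e^{-\mathbf{i}\alpha}}{e^{-\mathbf{i}\theta} - e^{-\mathbf{i}\alpha}}\right) = e^{\mathbf{i}\theta}. \]

To solve for φ, we must consider the two square roots

\[ e^{\mathbf{i}\varphi} = \pm e^{\mathbf{i}\theta/2}. \qquad (6.29) \]

This gives us two different values for φ, which we denote by φ and φ′
respectively.

The positive root from (6.29) gives φ = ½θ, as claimed earlier. Since this
result is independent of α, it holds for every point x on the circle outside the
arc. This completes the proof of the constant angle theorem, but we can
deduce more. The negative root from (6.29) gives φ′ = ½θ − π if θ > 0 and
φ′ = ½θ + π if θ < 0. This relation holds for every point x′ on the given arc
of the circle, as indicated in Figure 6.6. Angles for the two cases are obviously
related by φ′ = φ ± π, or equivalently by

\[ e^{\mathbf{i}\varphi'} = -e^{\mathbf{i}\varphi}. \qquad (6.30) \]

From this we can conclude that (6.28) applies with fixed φ to all points on the
circle if the parameter λ is allowed to have negative values.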
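The constant angle theorem lends itself to a numerical check. The sketch below assumes the usual identification of plane vectors with complex numbers, under which the quantity (a − x)⁻¹(b − x) of (6.28) corresponds to the complex quotient (z_b − z_x)/(z_a − z_x) and the unit bivector corresponds to the imaginary unit. The unit circle centered at the origin and the sample angles are arbitrary choices made for illustration.

```python
import cmath
import math


def subtended_angle(za, zb, zx):
    """Angle phi of (a - x)^{-1} (b - x), with plane vectors written as complex numbers."""
    return cmath.phase((zb - zx) / (za - zx))


theta = 1.2                       # angle of the arc from a to b, seen from the center
za, zb = 1 + 0j, cmath.exp(1j * theta)

# Points on the circle outside the arc from a to b (alpha between theta and 2*pi).
outside = [subtended_angle(za, zb, cmath.exp(1j * alpha)) for alpha in (2.0, 3.5, 5.5)]

# A point on the arc itself (alpha between 0 and theta).
on_arc = subtended_angle(za, zb, cmath.exp(1j * 0.5))
```

Every entry of `outside` should agree with θ/2, while `on_arc` should differ from it by π, matching the two roots in (6.29).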

We have, in fact, proved that (6.28) can be regarded as a parametric equation
for a circle in the i-plane passing through the points a and b. Points on the
circle are distinguished by the signed ratio of their distances from a and b,

\[ \lambda = \pm\frac{|\mathbf{b}-\mathbf{x}|}{|\mathbf{a}-\mathbf{x}|}. \qquad (6.31) \]

The two signs of λ correspond to the two arcs into which the circle is cut by a
and b. Each finite value of λ determines a distinct point of the circle, while the
singular values λ = ±∞ determine the point x = a.

Fig. 6.6.


Equation (6.28) helps us answer many questions about circles. For example,
to find an equation for the circle passing through distinct points a, b, d,
we write

\[ (\mathbf{a}-\mathbf{d})^{-1}(\mathbf{b}-\mathbf{d}) = \delta\, e^{\mathbf{i}\varphi}. \qquad (6.32) \]

This determines the angle φ in (6.28). Taking the ratio of (6.28) to (6.32) we
have the desired parametric equation,

\[ \frac{(\mathbf{a}-\mathbf{x})^{-1}(\mathbf{b}-\mathbf{x})}{(\mathbf{a}-\mathbf{d})^{-1}(\mathbf{b}-\mathbf{d})} = \frac{\lambda}{\delta} = \lambda'. \qquad (6.33) \]

The parameter λ′ has the values ±∞, 0, 1 at the points x = a, b, d respectively.

The quantity on the left of (6.33) is called the cross ratio of the points a, b,
x, d. It is well-defined for any four distinct points. From our derivation of
(6.33) we can conclude that four distinct points lie on a circle if and only if
their cross ratio is a scalar.

Returning to (6.28), we observe that each value of φ in the interval (−π/2,
π/2) determines a distinct circle, with the value φ = 0 determining a straight
line, which may be regarded as a circle passing through infinity. Thus, (6.28)
describes the 1-parameter family of circles in the i-plane which pass through
the points a and b, as shown in Figure 6.7. On the other hand, if λ is fixed and
positive while φ varies, then (6.28) describes the set of all points in the i-plane
whose distances from a and b have the fixed ratio λ. This set is also a circle,
called the circle of Apollonius. By varying λ we get the 1-parameter family of
all such circles (Figure 6.8). As shown in Figure 6.9, the circle with constant φ
intersects the circle with constant λ in two points x and x′ distinguished by the
values +|λ| and −|λ| respectively. Each point in the i-plane can be designated
in this way, so (6.28) can be regarded as a parametric equation for the
i-plane. The parameters λ and φ are then called bipolar coordinates for the
plane.

Fig. 6.7. The 1-parameter family of circles through points a and b.
Fig. 6.8. Circles of Apollonius.

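The cross ratio criterion for four points on a circle, stated in connection with (6.33), can be illustrated numerically. As before, this sketch assumes plane vectors are represented by complex numbers, so that "the cross ratio is a scalar" becomes "the complex cross ratio has vanishing imaginary part". The function name and the sample points are illustrative only.

```python
import cmath


def cross_ratio(za, zb, zx, zd):
    """Complex counterpart of the left side of (6.33) for points a, b, x, d."""
    return ((zb - zx) / (za - zx)) / ((zb - zd) / (za - zd))


# Four points on one circle (center 2 + 1j, radius 3).
center, radius = 2 + 1j, 3.0
on_circle = [center + radius * cmath.exp(1j * t) for t in (0.3, 1.9, 3.3, 5.0)]
concyclic = cross_ratio(*on_circle)

# Four points that are neither concyclic nor collinear.
generic = cross_ratio(0j, 1 + 0j, 1j, 2 + 2j)
```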
Conic Sections

Next to straight lines and circles, the simplest curves are the conic sections,
so-called because each can be defined as the intersection of a cone with a
plane. We shall prefer the following alternative definition, because it leads
directly to a most valuable parametric equation: A conic is the set of all points
in the Euclidean plane ℰ₂ with the property that the distance of each point
from a fixed point (the focus) is in fixed ratio (the eccentricity) to the distance
of that point from a fixed line (the directrix). To express this as an equation,
we denote the eccentricity by ε, the directance from the focus to the directrix
by d = dê with ê² = 1, and the directance from the focus to any point on the
conic by r (see Figure 6.10). The defining condition for a conic can then be
written

\[ \frac{|\mathbf{r}|}{d - \mathbf{r}\cdot\hat{\mathbf{e}}} = \varepsilon. \qquad (6.34) \]

Solving this for r = |r| and introducing the eccentricity vector ε = εê along
with the so-called semi-latus rectum ℓ = εd, we get the more convenient
equation

\[ r = \frac{\ell}{1 + \boldsymbol{\varepsilon}\cdot\hat{\mathbf{r}}}. \qquad (6.35) \]

Fig. 6.10.

This expresses the distance r from the focus to a point on the conic as a
function of the direction r̂ to the point. Alternatively, the same condition can
be expressed as a parametric equation for r as a function of the angle θ
between ê and r. Thus, substituting ε·r̂ = ε cos θ into (6.35), we get

\[ r = \frac{\ell}{1 + \varepsilon\cos\theta}. \qquad (6.36) \]

This is a standard equation for conics, but we usually prefer (6.35), because it
shows the dependence of r on the directions ê and r̂ explicitly, while this
dependence in (6.36) is expressed only indirectly through the definition of θ.
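A short numerical sketch can confirm that points generated by (6.36) satisfy the defining focus-directrix condition (6.34). It assumes scalar inputs ℓ > 0 and ε > 0 and uses d = ℓ/ε from the definition of the semi-latus rectum; the function names are hypothetical.

```python
import math


def conic_radius(ell, ecc, theta):
    """Distance from the focus, r = ell / (1 + ecc * cos(theta)), as in (6.36)."""
    return ell / (1.0 + ecc * math.cos(theta))


def focus_directrix_ratio(ell, ecc, theta):
    """Left side of (6.34): |r| / (d - r.e_hat), with d = ell / ecc."""
    r = conic_radius(ell, ecc, theta)
    d = ell / ecc
    return r / (d - r * math.cos(theta))


ellipse_ratios = [focus_directrix_ratio(2.0, 0.5, t) for t in (0.0, 1.0, 2.5)]
hyperbola_ratio = focus_directrix_ratio(2.0, 2.0, 1.0)
```

Each computed ratio should reproduce the eccentricity that was put in, whatever the angle.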

Equation (6.35) determines a curve when r is restricted to directions in a
plane, but if r is allowed to range over all directions in ℰ₃, then (6.35)
describes a 2-dimensional surface called a conicoid. Our definition of a conic
can be used for a conicoid simply by interpreting the directrix as a plane
instead of a line. Both the conics and the conicoids are classified according to
the values of the eccentricity as shown in Table 6.1.


TABLE 6.1. Classification of Conics and Conicoids.

Eccentricity     Conic        Conicoid
ε > 1            hyperbola    hyperboloid
ε = 1            parabola     paraboloid
0 < ε < 1        ellipse      ellipsoid
ε = 0            circle       sphere


The 1-parameter family of conics with a common focus and pericenter is 
illustrated in Figure 6.11. The pericenter is the point on the conic at which r 
has a minimum value. In the hyperbolic case there are actually two peri- 
centers, one on each branch of the hyperbola. Only one of these is shown in 
Figure 6.11. If the conics in Figure 6.11 are rotated about the axis through the 
focus and pericenter, they “‘sweep out” corresponding conicoids. 

The conics and conicoids have quite a remarkable variety of properties, 
which is related to the fact that they can be described by many different 
equations besides (6.28). Rather than undertake a systematic study of those 
properties, we shall wait for them to arise in the context of physical problems,
and we will be better prepared for this when we have the tools of differential
calculus at our disposal.

Our study of analytic geometry has just begun. The study of particle
trajectories, which we undertake in the next chapter, is largely analytic
geometry in ℰ₃. For those who wish to study the classical analytic geometry
in ℰ₂ in more detail, the book of Zwikker (1963) is recommended. He
formulates analytic geometry in terms of complex numbers and shows how
much this improves on the traditional methods of coordinate geometry. Of
course, everything he does is easily reexpressed in the language of geometric
algebra, which has all the advantages of complex numbers and more.
Indeed, geometric algebra brings further improvements to Zwikker’s
treatment by enlarging the algebraic system from 𝒢₂⁺ to 𝒢₂, and so
introducing the fundamental distinction between vectors and spinors and
along with it the concepts of inner and outer products. Most important,
geometric algebra provides for the generalization of the geometry in ℰ₂ to
ℰ₃. The present book develops all the principles and techniques needed for
analytic geometry, but Zwikker’s book is a valuable storehouse of particular
facts about curves in ℰ₂. Among other things, it includes the remarkable
proof that conic sections as defined by (6.34) really are sections of a cone.


Fig. 6.11. Conics with a common focus and pericenter.

2-6. Exercises

(6.1) From Equation (6.2) derive the following equations for the line ℒ in
terms of rectangular coordinates in ℰ₃:

\[ \frac{x_1 - a_1}{u_1} = \frac{x_2 - a_2}{u_2} = \frac{x_3 - a_3}{u_3}, \]

where x_k = x·σ_k, a_k = a·σ_k, u_k = u·σ_k.
(6.2) (a) Show that Equation (6.2) is equivalent to the parametric equation 


x=a+tdAu'. 
(b) Describe the solution set {x = x(t)} of the parametric equation

\[ \mathbf{x} = \mathbf{a} + t^2\mathbf{u} \]

for all scalar values of the parameter t.
(6.3) (a) Compute the directance to the line through points a and b from
the origin.
(b) Compute the directance to this line from an arbitrary point c.
(6.4) Prove the theorem “Three points not on a line determine a plane”
by using geometric algebra to derive an equation for the plane from
three points a, b, c.
(6.5) Describe the solution set {x} of the simultaneous equations 


rian == (0) Aa) 


if A and B are noncommuting blades of grade 2. 

(6.6) Find the point of intersection of the line {x} determined by the
equation (x − a)∧u = 0 with the plane {y} determined by the
equation (y − b)∧B = 0. What are the conditions on a, b, u, B that
this point exists and is unique?

(6.7) The directance from one point set to another can be defined quite
generally as the chord of minimum length between points in the two
sets, provided there is only one such chord.

Determine the directance d from a line with direction u through a
point a to a line with direction v through a point b. Show that the
lines intersect only if (a − b)∧u∧v = 0.

(6.8) Compute the directance from a point b to the plane
{x: (x − a)∧U = 0}.

(6.9) Show that the equation

\[ \mathbf{x} = \alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c} \]

subject to the conditions

\[ \alpha + \beta + \gamma = 1 \quad\text{and}\quad \mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c} \neq 0 \]

can be regarded as a parametric equation for a plane. Find a
nonparametric equation for this plane.

(6.10) Ceva’s Theorem: Suppose that concurrent lines from the vertices a, b,
c of a triangle divide the opposing sides at a′, b′, c′ (Figure 6.12).
Then the division ratios satisfy

\[ \left(\frac{\mathbf{a}-\mathbf{c}'}{\mathbf{c}'-\mathbf{b}}\right)\left(\frac{\mathbf{b}-\mathbf{a}'}{\mathbf{a}'-\mathbf{c}}\right)\left(\frac{\mathbf{c}-\mathbf{b}'}{\mathbf{b}'-\mathbf{a}}\right) = 1. \]

Prove by showing that the areas 
indicated in the figure satisfy 
Fig. 6.12. Ceva’s Theorem. Are aC. 7 


o> 


i 


(6.11) In Figure 6.5 we have the division ratios 
pee So pac 0 wee!
~ b-a’ eee y-0° 


Prove the theorem of Menelaus: λμν = −1. Note that the theorem
can be interpreted as expressing a relation among intersecting sides
of a quadrilateral with vertices 0, a, b, c or as a property of a
transversal cutting the triangle with vertices 0, a, x.
(6.12) Prove that three points a, b, c lie on a line iff there exist nonzero
scalars α, β, γ such that αa + βb + γc = 0 and α + β + γ = 0.
(6.13) Desargues’ Theorem. Given two triangles a, b, c and a′, b′, c′. Then
lines through corresponding vertices are concurrent (at a point s) iff
lines along corresponding sides intersect at collinear points
(p, q, r) (Figure 6.13). Note that the triangles need not lie in the
same plane.

Fig. 6.13. Desargues’ Theorem.

(6.14) The equation (x − b)·u = 0 describes a plane in ℰ₃ with normal u.
Derive this equation from Equation (6.21).
(6.15) Four points a, b, c, d determine a tetrahedron with directed volume

\[ V = \tfrac{1}{6}\,(\mathbf{b}-\mathbf{a})\wedge(\mathbf{c}-\mathbf{a})\wedge(\mathbf{d}-\mathbf{a}) \]
\[ \phantom{V} = \tfrac{1}{6}\,(\mathbf{b}\wedge\mathbf{c}\wedge\mathbf{d} - \mathbf{c}\wedge\mathbf{d}\wedge\mathbf{a} + \mathbf{d}\wedge\mathbf{a}\wedge\mathbf{b} - \mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c}). \]

Use this to determine the equation for a plane through three distinct
points a, b, c.

(6.16) Let a, b, c be the directions of three coplanar lines. The relative
directions of the lines are then specified by α = b·c, β = a·c,
γ = a·b. Prove that

\[ 2\alpha\beta\gamma = \alpha^2 + \beta^2 + \gamma^2 - 1. \]

(6.17) Determine the parametric values λ₁, λ₂ for which the line x = x(λ) =
a + λu (u² = 1) intersects the circle with equation x² = r², and
show that λ₁λ₂ = a² − r² for every line through a which intersects the
circle.

(6.18) Show that tangents to the circle of radius r and center at the origin in
ℰ₂ which pass through a given point a intersect the circle at the
points

\[ \mathbf{d}_\pm = \left(1 - \frac{a_0}{r}\,\mathbf{i}\right) r^2\,\mathbf{a}^{-1} = (r - a_0\mathbf{i})\, r\,\mathbf{a}^{-1}, \]

where a₀ = ±(a² − r²)^{1/2} and i is the unit bivector.

(6.19) Find the radius r and center c of the circle determined by Equation
(6.28).

(6.20) Let x and y be rectangular coordinates of the point x. Show that the
defining equation (6.35) of a conic is equivalent to the equations

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \qquad \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1, \]

for an ellipse and a hyperbola respectively, where

\[ a\ell = b^2 = \varepsilon a d, \qquad \mathbf{x} = \mathbf{r} + a\boldsymbol{\varepsilon}. \]

The curves and related parameters are shown in Figures 6.14a, b.

Fig. 6.14a. Ellipse.

Fig. 6.14b. Hyperbola.

Use the above equations to show that an ellipse has a parametric
equation x = x(φ) with the explicit form

\[ \mathbf{x} = \mathbf{a}\cos\varphi + \mathbf{b}\sin\varphi, \]

while a hyperbola has the parametric equation

\[ \mathbf{x} = \mathbf{a}\cosh\varphi + \mathbf{b}\sinh\varphi, \]

where \mathbf{a}^2 = a^2, \mathbf{b}^2 = b^2 and \mathbf{a}\cdot\mathbf{b} = 0.

(6.21) Parametric curves x = x(λ) of the second order are defined by the
equation

\[ \mathbf{x} = \frac{\mathbf{a}_0 + \mathbf{a}_1\lambda + \mathbf{a}_2\lambda^2}{\alpha_0 + \alpha_1\lambda + \alpha_2\lambda^2}. \]

Note that this generalizes the Equation (6.16) for a line. By the
change of parameters λ → λ − α₁/2α₂, this can be reduced to the
form

\[ \mathbf{x} = \frac{\mathbf{a}_0 + \mathbf{a}_1\lambda + \mathbf{a}_2\lambda^2}{\alpha + \lambda^2}. \]

We now aim to show that this equation describes an ellipse iff α > 0,
a parabola iff α = 0, and a hyperbola iff α < 0, where iff means “if
and only if”. Thus, all conics are second order curves and conversely.
Show that
(a) For α = 1, the change of parameters λ = tan ½φ enables us to
put the equation in the form

\[ \mathbf{x} = \mathbf{a}\cos\varphi + \mathbf{b}\sin\varphi + \mathbf{c}, \]

which we recognize as a general equation for an ellipse.
(b) For α = −1, λ = tanh ½φ gives

\[ \mathbf{x} = \mathbf{a}\cosh\varphi + \mathbf{b}\sinh\varphi + \mathbf{c}. \]

(6.22) Solve Equation (6.28) for x and put it in the general form given in 
exercise (6.21). 

(6.23) Let x = x(λ) = σ₁z(λ) describe a curve in ℰ₂. Identify and draw
diagrams of the curves determined by the following specific forms 
for the spinor z. 


(a) z= (1 + id)'” 


1 
De ae 
(c) z = (1- iA) e* 
(@) z= (1-iay”, 


(6.24) Describe the solution set {x} in ℰ₃ determined by the following
equations. Comment, especially, on the dependence of the solution 
set on vector parameters a, b, c. 


(a) (a:x)? = x’. 

(b) ax 2=(x|. 

(cy (aea— 0; “fax 0. 

(d) ((ax)*), = 1. 

(e) xa + (xaa)? = 0. 

(f) (a-x)? = x and (x-—c)-b = 0. 


2-7. Functions of a Scalar Variable 


In this section we review some basic concepts of differential and integral
calculus to show how they apply to multivector-valued functions.
If to each value of a scalar variable t there corresponds a multivector F(t),
then F = F(t) is said to be a (multivector-valued) function of t. It is important
to distinguish between the function F and the functional value F(t₀), the
particular multivector which corresponds to the particular scalar t₀. However,
it is often inconvenient to make that distinction explicit in the mathematical
notation, so the reader will be left to infer it from the context. Thus, F(t) will
denote a functional value if t is understood to be a specific real number, but
F(t) will denote a function if no specific value is attributed to t.* Similarly,
when the variable t is suppressed, F may indicate a value of the function
instead of the function itself. So F = F(t) may refer either to a function or a
functional value. It should be understood, also, that the function F(t) is not
completely defined until the values of the variable t for which it is defined
have been specified. However, the reader will usually be left to infer the
allowed values of a variable from the context.


Continuity 


The function F(t) is said to be continuous at t₀ if

\[ \lim_{t\to t_0} |F(t) - F(t_0)| = 0. \qquad (7.1a) \]

We write

\[ \lim_{t\to t_0} F(t) = F(t_0), \quad\text{or}\quad F(t) \to F(t_0). \qquad (7.1b) \]

The definition of “limit” presumed in (7.1a) is the same as the one introduced
in elementary calculus. It applies to multivectors, because we have already
introduced an appropriate definition of the “absolute value” |F(t)|, and, in
spite of the fact that the geometric product is not commutative, it can be
proved that for multivector-valued functions F(t) and G(t) we have the
elementary results

\[ \lim_{t\to t_0} \big(F(t) + G(t)\big) = \lim_{t\to t_0} F(t) + \lim_{t\to t_0} G(t), \qquad (7.2a) \]
\[ \lim_{t\to t_0} F(t)G(t) = \lim_{t\to t_0} F(t)\,\lim_{t\to t_0} G(t). \qquad (7.2b) \]

To this we can add the (almost trivial) result

\[ \big\langle \lim_{t\to t_0} F(t) \big\rangle_k = \lim_{t\to t_0} \langle F(t)\rangle_k. \qquad (7.2c) \]

It follows, then, that the function F is continuous if its k-vector parts ⟨F⟩_k
are continuous functions.


*It may be noted that a variable is itself a function. Given a set, the variable on that set is just the
identity function, namely that function which associates each element of the set with itself. So
F(t) can be interpreted as the composite of two functions F and t.
Scalar differentiation


The derivative of the function F = F(t) at the point t₀ is denoted by Ḟ = Ḟ(t₀)
or dF/dt = dF/dt (t₀) and defined by

\[ \frac{dF}{dt}(t_0) = \lim_{\Delta t\to 0} \frac{F(t_0 + \Delta t) - F(t_0)}{\Delta t}. \qquad (7.3) \]

This will sometimes be referred to as a scalar derivative to emphasize that the
variable is a scalar and to distinguish it from the vector derivative to be
defined in a subsequent volume, NFII. Unless there is some reason to believe
otherwise, it will usually be convenient to make the tacit assumption that
functions we deal with have derivatives that “exist” in the mathematical
sense. Such functions are said to be differentiable.

The derivative of F = F(t) is itself a function Ḟ = Ḟ(t), so we can
contemplate its derivative. This is called the second derivative and denoted by
F̈ = d²F/dt². Similarly, derivatives of higher order are defined as in
elementary calculus.

We will be particularly interested in curves representing paths (trajectories)
of physical particles. Such a curve is described by a parametric equation
x = x(t), a vector-valued function of the time t. The derivative ẋ = ẋ(t) is
called the velocity of the particle; it is, of course, defined by (7.3), which we
can put in the abbreviated form

\[ \dot{\mathbf{x}} = \frac{d\mathbf{x}}{dt} = \lim_{\Delta t\to 0} \frac{\Delta\mathbf{x}}{\Delta t}. \qquad (7.4) \]

The curve and vectors involved in the derivative are shown in Figure 7.1. The
derivative of the velocity,

\[ \ddot{\mathbf{x}} = \frac{d^2\mathbf{x}}{dt^2} = \lim_{\Delta t\to 0} \frac{\Delta\dot{\mathbf{x}}}{\Delta t}, \qquad (7.5) \]

is called the acceleration of the particle.

From multivector-valued functions F = F(t) and G = G(t) we can form
new functions by addition and multiplication. Their derivatives are subject to
the rules

\[ \frac{d}{dt}(F + G) = \dot{F} + \dot{G}, \qquad (7.6a) \]
\[ \frac{d}{dt}(FG) = \dot{F}G + F\dot{G}, \qquad (7.6b) \]
\[ \frac{d}{dt}\langle F\rangle_k = \langle\dot{F}\rangle_k. \qquad (7.6c) \]

These rules can be derived from (7.2a, b, c) by arguments of elementary
calculus. The proof of (7.6c) depends in addition on the axioms of geometric
algebra. It must be realized that (7.6b) differs from the result in elementary
calculus by requiring that the order of factors in the product be retained. In
particular, (7.6b) implies

\[ \frac{dF^2}{dt} = \frac{d}{dt}(FF) = \dot{F}F + F\dot{F}. \qquad (7.7) \]
The right side of (7.7) is equal to the elementary result 2FḞ only if Ḟ
commutes with F. To understand the significance of this deviation from the
elementary result, consider the special case of a vector function v = v(t)
which might be, for example, the velocity of a particle. Now the square of the
vector v equals v², where v = |v|, so from (7.7) we get

\[ \frac{dv^2}{dt} = 2v\dot{v} = \dot{\mathbf{v}}\mathbf{v} + \mathbf{v}\dot{\mathbf{v}} = 2\,\dot{\mathbf{v}}\cdot\mathbf{v}. \qquad (7.8) \]

With a change in t, the vector v = v v̂ can undergo changes in magnitude v
and in direction v̂. If v is constant (i.e. constant speed), then (7.8) implies

\[ \dot{\mathbf{v}}\cdot\mathbf{v} = 0, \qquad (7.9a) \]

or equivalently,

\[ \dot{\mathbf{v}}\mathbf{v} = -\mathbf{v}\dot{\mathbf{v}}. \qquad (7.9b) \]

Thus a vector which undergoes changes in direction only is orthogonal to its
derivative. On the other hand, if the direction v̂ is constant, then
\dot{\mathbf{v}} = \dot{v}\hat{\mathbf{v}}, and (7.8) reduces to

\[ \frac{dv^2}{dt} = 2v\dot{v} = 2\,\dot{\mathbf{v}}\mathbf{v} = 2\,\dot{\mathbf{v}}\cdot\mathbf{v}. \qquad (7.10) \]

We see, then, that (7.7) need not reduce to the equation dF²/dt = 2FḞ if F
changes in direction, but it will if F changes in magnitude only. Continuous
changes in direction are not considered in elementary calculus, which deals
with scalar-valued functions only; changes in direction of scalar-valued
functions are limited to changes in sign.
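The need to keep the order of factors in (7.7) can be seen in a small experiment. Since no geometric algebra library is assumed here, the sketch uses quaternions (stored as 4-tuples with the Hamilton product) as a stand-in for a noncommutative multivector-valued function; the particular function F(t) = 1 + i cos t + j sin t and all helper names are arbitrary choices made for illustration.

```python
import math


def qmul(p, q):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)


def qadd(p, q):
    """Componentwise sum of two quaternions."""
    return tuple(a + b for a, b in zip(p, q))


def F(t):
    """A function whose value does not commute with its derivative."""
    return (1.0, math.cos(t), math.sin(t), 0.0)


def Fdot(t):
    """Exact derivative of F."""
    return (0.0, -math.sin(t), math.cos(t), 0.0)


def numeric_derivative_of_square(t, h=1e-5):
    """Central-difference estimate of d(F F)/dt."""
    plus = qmul(F(t + h), F(t + h))
    minus = qmul(F(t - h), F(t - h))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))


t = 0.7
ordered = qadd(qmul(Fdot(t), F(t)), qmul(F(t), Fdot(t)))  # right side of (7.7)
naive = tuple(2 * c for c in qmul(F(t), Fdot(t)))         # elementary rule 2 F Fdot
numeric = numeric_derivative_of_square(t)
```

The ordered expression tracks the numerical derivative, while the elementary rule picks up a spurious component coming from the noncommuting parts.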


Constant Magnitude 


It is now merely an algebraic exercise to show that the vector-valued function
v = v(t) has constant magnitude if and only if there exists a bivector-valued
function Ω = Ω(t) such that

\[ \dot{\mathbf{v}} = \mathbf{v}\cdot\boldsymbol{\Omega}. \qquad (7.11) \]

Expressing Ω as the dual of a vector ω by writing Ω = iω, we have

\[ \mathbf{v}\cdot\boldsymbol{\Omega} = \mathbf{v}\cdot(i\boldsymbol{\omega}) = i\,\mathbf{v}\wedge\boldsymbol{\omega} = -\mathbf{v}\times\boldsymbol{\omega} = \boldsymbol{\omega}\times\mathbf{v}; \]

hence (7.11) is equivalent to the equation

\[ \dot{\mathbf{v}} = \boldsymbol{\omega}\times\mathbf{v}. \qquad (7.12) \]

To show that (7.11) implies that |v| is constant, we use the algebraic identity

\[ \mathbf{v}\cdot(\mathbf{v}\cdot\boldsymbol{\Omega}) = (\mathbf{v}\wedge\mathbf{v})\cdot\boldsymbol{\Omega} = 0 \]

to prove that v·v̇ = 0, so the result follows from our previous considerations.
To prove the converse, suppose that Ω exists and we introduce a bivector B to
write

\[ \boldsymbol{\Omega} = \mathbf{v}^{-1}\wedge\dot{\mathbf{v}} + \mathbf{B}. \qquad (7.13) \]

If |v| is constant, then v·v̇ = 0, which implies that v⁻¹∧v̇ = v⁻¹v̇ and
v·(v⁻¹∧v̇) = v̇. Therefore, (7.13) satisfies (7.11) if v·B = 0, but this condition
implies that the dual of B is collinear with v, so there exists a scalar λ such that
B = λiv⁻¹, and

\[ \boldsymbol{\Omega} = \mathbf{v}^{-1}\wedge\dot{\mathbf{v}} + \lambda i\mathbf{v}^{-1} = \mathbf{v}^{-1}(\dot{\mathbf{v}} + \lambda i). \qquad (7.14) \]

This shows not only that (7.11) has a bivector solution as required for our
proof, but that the solution is not unique without some condition to
determine the scalar-valued function λ = λ(t).
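The equivalent form (7.12) makes the constant-magnitude property easy to observe numerically. The following sketch assumes a constant ω and integrates v̇ = ω × v with a classical fourth-order Runge-Kutta step; the integrator, step size and sample vectors are choices made here for illustration, not part of the text.

```python
import math


def dot(u, v):
    """Inner product of two 3-vectors."""
    return sum(p * q for p, q in zip(u, v))


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def add_scaled(u, w, s):
    """Return u + s * w."""
    return tuple(a + s * b for a, b in zip(u, w))


def rk4_step(v, omega, h):
    """One Runge-Kutta step for the equation dv/dt = omega x v."""
    k1 = cross(omega, v)
    k2 = cross(omega, add_scaled(v, k1, h / 2))
    k3 = cross(omega, add_scaled(v, k2, h / 2))
    k4 = cross(omega, add_scaled(v, k3, h))
    return tuple(vi + h / 6 * (a + 2 * b + 2 * c + d)
                 for vi, a, b, c, d in zip(v, k1, k2, k3, k4))


omega = (0.0, 0.0, 2.0)
v0 = (3.0, 0.0, 4.0)
v = v0
for _ in range(1000):
    v = rk4_step(v, omega, 0.01)

speed_start = math.sqrt(dot(v0, v0))
speed_end = math.sqrt(dot(v, v))
```

Up to integration error the speed is unchanged, and the derivative ω × v stays orthogonal to v, as (7.9a) requires.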


Chain Rule 


To complete our review of scalar differentiation, we consider the effect of
changing variables. When F = F(s) is a function of a scalar variable s, and
s = s(t) is in turn a function of a scalar variable t, then by substitution one has
F = F(s(t)) = F(t). Now F(s) is not generally the same function as F(t), but
the values of both functions are identical for corresponding values of s and t,
and this is emphasized by using the same symbol F for both functions as well
as functional values. The derivatives of F with respect to both variables are
related by the familiar chain rule of elementary calculus:

\[ \frac{dF}{dt} = \frac{ds}{dt}\,\frac{dF}{ds}. \qquad (7.15) \]


Scalar Integration 


Like the rules for differentiation, the rules for integration of multivector-
valued functions of a scalar variable can be taken over directly from elemen-
tary calculus as long as the order of noncommuting factors is retained in any
products. Accordingly, we have the familiar formulas

\[ \int_a^b F(t)\,dt = -\int_b^a F(t)\,dt, \qquad (7.16a) \]
\[ \int_a^b [F(t) + G(t)]\,dt = \int_a^b F(t)\,dt + \int_a^b G(t)\,dt, \qquad (7.16b) \]

and for a < c < b,

\[ \int_a^b F(t)\,dt = \int_a^c F(t)\,dt + \int_c^b F(t)\,dt. \qquad (7.16c) \]

If A is a constant multivector, then

\[ \int_a^b AF(t)\,dt = A\int_a^b F(t)\,dt, \qquad (7.17a) \]
\[ \int_a^b F(t)A\,dt = \left(\int_a^b F(t)\,dt\right) A. \qquad (7.17b) \]

These two formulas are not equivalent unless A commutes with all values of
F(t) over the interval [a, b]. Integrals can be separated into k-vector parts
according to the formula

\[ \Big\langle \int_a^b F(t)\,dt \Big\rangle_k = \int_a^b \langle F(t)\rangle_k\,dt. \qquad (7.18) \]

This follows from (1.10), for, being the limit of a sum, the integral has the
algebraic properties of a sum.

Scalar differentiation and integration are related by two basic formulas;
first, the “fundamental formula of integral calculus”,

\[ \int_a^b \dot{F}(t)\,dt = F(t)\Big|_a^b = F(b) - F(a), \qquad (7.19) \]

evaluating the integral of a derivative; second, the formula for the derivative
of an integral:

\[ \frac{d}{dt}\int_a^t F(s)\,ds = F(t). \qquad (7.20) \]

To simplify the notation, the latter formula is sometimes written in the
ambiguous form

\[ \frac{d}{dt}\int_a^t F(t)\,dt = F(t), \]

wherein t has two different meanings.
Taylor Expansion


With the fundamental formula (7.19), one can generate Taylor’s formula

\[ F(t + s) = F(t) + s\dot{F}(t) + \frac{s^2}{2!}\ddot{F}(t) + \cdots = \sum_{k=0}^{\infty} \frac{s^k}{k!}\,\frac{d^k}{dt^k}F(t). \qquad (7.21) \]

This power series expansion applies to any function F possessing derivatives
of all orders in an interval of the independent variable containing t and t + s.
It is of great utility in mathematical physics, because it often enables one to
approximate a complicated function F by a tractable polynomial function of
the independent variable.

The derivation of Taylor’s formula is worth reviewing to recall how the
expansion is generated by the fundamental formula (7.19). To begin with, the
fundamental formula allows us to write

\[ I \equiv \int_t^{t+s} \dot{F}(v)\,dv = F(t + s) - F(t). \]

With the change of variables v = t + s − u, the integral can be written

\[ I = \int_0^s \dot{F}(t + s - u)\,du. \]

Next, the fundamental formula (7.19) and the product formula (7.6b) enable
us to “integrate by parts” to get

\[ I = u\dot{F}(t + s - u)\Big|_{u=0}^{s} + \int_0^s u\ddot{F}(t + s - u)\,du = s\dot{F}(t) + \int_0^s u\ddot{F}(t + s - u)\,du. \]

Again, integration of the second term by parts yields

\[ I = s\dot{F}(t) + \frac{s^2}{2!}\ddot{F}(t) + \int_0^s \frac{u^2}{2!}\,\dddot{F}(t + s - u)\,du. \]

Thus we have generated the first three terms in the series (7.21). Moreover, if
the series is terminated at this point, we have an exact integral expression for
the remainder, namely \int_0^s (u^2/2!)\,\dddot{F}(t + s - u)\,du, which is sometimes
useful for estimating the error incurred in the approximation.
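The exact remainder formula can be tested on a simple scalar example. This sketch takes F = sin, for which the first three derivatives are known in closed form, and evaluates the remainder integral with a composite Simpson rule; the quadrature routine and the sample values of t and s are assumptions made for the illustration.

```python
import math


def simpson(f, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


def taylor_with_remainder(t, s):
    """Three-term Taylor sum and integral remainder for F = sin.

    Uses F' = cos, F'' = -sin, F''' = -cos.
    """
    series = math.sin(t) + s * math.cos(t) + s ** 2 / 2 * (-math.sin(t))
    remainder = simpson(lambda u: u ** 2 / 2 * (-math.cos(t + s - u)), 0.0, s)
    return series, remainder


t, s = 0.4, 0.9
series, remainder = taylor_with_remainder(t, s)
exact = math.sin(t + s)
```

The truncated series alone is visibly off for this fairly large step s, but adding the integral remainder recovers F(t + s) to quadrature accuracy.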


2-7. Exercises 


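(7.1) Let u = u(t) be a vector-valued function, and write u = |u|. Show
that

\[ \frac{d}{dt}\left(\frac{\mathbf{u}}{u}\right) = \frac{\mathbf{u}\cdot(\mathbf{u}\wedge\dot{\mathbf{u}})}{u^3} = \frac{(\mathbf{u}\times\dot{\mathbf{u}})\times\mathbf{u}}{u^3}. \]

(7.2) For a multivector-valued function F = F(t), show that:
(a) For any integer k > 1,

\[ \frac{d}{dt}F^k = \dot{F}F^{k-1} + F\dot{F}F^{k-2} + \cdots + F^{k-1}\dot{F}. \]

(b) If for each t the value of F has an algebraic inverse F⁻¹, then

\[ \frac{d}{dt}F^{-1} = -F^{-1}\dot{F}F^{-1}. \]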
(c) If| F |? = FTF, then 
ae te 
= Oe 
AL = xF'F), 


(d) If | F |? is constant, then 
C= 0, andi) a0. 


(7.3) Prove that F(t) = e^{At}B for any constant multivectors A and B if and
only if

\[ \dot{F}(t) = AF(t). \]

This relation between F and Ḟ, the simplest possible relation
between a multivector-valued function and its scalar derivative,
could be taken as the defining property of the exponential function,
since it can be used to generate the infinite series (5.1) representing
the exponential function. Note that F(1) = e^{A}B.


(7.4) For F = F(t) = | F| F show that FF = FF if 


dlFl 4, oF 


dt dt = he 


Show that the converse is true if F is a k-blade; find a counterexample
to prove that the converse is not true more generally.


(7.5) Use Equation (7.6b) to prove

$$ \frac{d}{dt}(r\cdot p) = \dot r\cdot p + r\cdot\dot p, $$

$$ \frac{d}{dt}(r\wedge p) = \dot r\wedge p + r\wedge\dot p, $$

where r = r(t) and p = p(t) are vector-valued functions.
It should be clear that as long as the order of factors is retained,
the rule for differentiating all products is essentially the same,
whether applied to inner or outer products as above or to the
geometric product as in (7.6b). Express the derivatives of
$p\cdot(q\wedge r) = p\times(r\times q)$ and $p\wedge q\wedge r = i\,p\cdot(q\times r)$ in terms of
$\dot p$, $\dot q$ and $\dot r$.
(7.6) From Exercise (7.3) we have

$$ \frac{d}{dt}e^{At} = Ae^{At} = e^{At}A $$

for any constant multivector A. Use this to prove

$$ \frac{d}{dt}\cosh(At) = A\sinh(At), $$

$$ \frac{d}{dt}\sinh(At) = A\cosh(At), $$

$$ \frac{d}{dt}\cos(At) = -A\sin(At), $$

$$ \frac{d}{dt}\sin(At) = A\cos(At). $$


(7.7) Show that the derivative of any vector-valued function v = v(t) can
be expressed in the form

$$ \dot v = v\cdot\Omega + \tfrac{1}{2}\,v\,\frac{d}{dt}\log v^2. $$

The first term on the right describes the rate of direction change,
while the second term describes the rate of magnitude change.


2-8. Directional Derivatives and Line Integrals


The laws of physics are expressed as mathematical functions of position as 
well as time. To deal with such functions, we must extend the differential and 
integral calculus of functions with scalar variables to a calculus of functions 
with vector variables. In a subsequent volume, NFII, the general concept of 
differentiation with respect to a vector variable will be developed, but here we 
restrict our considerations to a special case which is closely related to scalar
differentiation. The main results of this section will first be used in Sections 
3-8 and 3-10, so study of this section can be deferred to that point. 

Let F = F(x) be a multivector-valued function of a vector variable x
defined on some region of the Euclidean space $\mathcal{E}_3$. In physical applications the
symbol x will denote a place in Physical Space, in which case F is said to be a
“function of position”. Such a function is also called a field, a vector field if it
is vector-valued, a scalar field if it is scalar-valued, a spinor field if it is
spinor-valued, etc.


Directional Derivatives


The directional derivative of the function F = F(x) is denoted by $a\cdot\nabla F$ or
$a\cdot\nabla F(x)$ and can be defined in terms of the scalar derivative by

$$ a\cdot\nabla F = \left.\frac{dF(x + a\tau)}{d\tau}\right|_{\tau=0} = \lim_{\tau\to 0}\frac{F(x+\tau a) - F(x)}{\tau}. \tag{8.1} $$


Many authors call $a\cdot\nabla F$ a directional derivative only if a is a unit vector. In
NFII it will be seen that the dot in $a\cdot\nabla$ can actually be interpreted as an inner
product, but for the time being, it can be regarded as a special notation.
However, it is important to note that, just as the distributive property of the
inner product would require $(a + b)\cdot\nabla = a\cdot\nabla + b\cdot\nabla$, so, by an elementary
mathematical exercise with limits, it can be proved from (8.1) that

$$ (a+b)\cdot\nabla F = a\cdot\nabla F + b\cdot\nabla F. \tag{8.2a} $$

Moreover, for any scalar $\lambda$,

$$ (\lambda a)\cdot\nabla F = \lambda\,(a\cdot\nabla F). \tag{8.2b} $$


Besides this, the directional derivative obviously has all the general properties
of the scalar derivative which were mentioned in the preceding section; thus,
for multivector-valued functions F = F(x) and G = G(x), we have

$$ a\cdot\nabla(F+G) = a\cdot\nabla F + a\cdot\nabla G, \tag{8.3a} $$
$$ a\cdot\nabla(FG) = (a\cdot\nabla F)\,G + F\,(a\cdot\nabla G), \tag{8.3b} $$
$$ a\cdot\nabla\langle F\rangle_k = \langle a\cdot\nabla F\rangle_k. \tag{8.3c} $$

Also, if $F = F(\lambda(x))$ where $\lambda = \lambda(x)$ is a scalar-valued function, then we have
the chain rule

$$ a\cdot\nabla F = (a\cdot\nabla\lambda)\,\frac{dF}{d\lambda}. \tag{8.4} $$

The most basic function of a vector variable is the “identity function”
F(x) = x. To determine its directional derivative, we observe that
$\frac{d}{d\tau}(x + a\tau) = a$, so according to the definition (8.1)

$$ a\cdot\nabla x = a. \tag{8.5} $$

Obviously, the “constant function” F(x) = A has the trivial derivative

$$ a\cdot\nabla A = 0. \tag{8.6} $$


From these basic derivatives, the derivatives of more complicated functions 
can be determined by using the general rules (8.3) and (8.4) without further 
appeal to the definition (8.1). In particular, the derivative of any algebraic 
function of x can be determined in this way. For example, to differentiate the 
function |x|, we note that it is related to x by the algebraic equation 
$|x| = (x^2)^{1/2}$, or, more simply, by $|x|^2 = x^2$. Using the product rule (8.3b),
we have

$$ a\cdot\nabla x^2 = ax + xa = 2\,a\cdot x. \tag{8.7} $$

On the other hand, the chain rule gives

$$ a\cdot\nabla |x|^2 = 2\,|x|\;a\cdot\nabla |x|. $$

So, equating this with (8.7), we get the desired result

$$ a\cdot\nabla |x| = \frac{a\cdot x}{|x|} = a\cdot\hat{x}. \tag{8.8} $$


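It is helpful to know the derivative of the “direction function” $\hat{x}$ as well as the
“magnitude function” $|x|$. It is obtained from (8.5) and (8.8) by using the
product rule and the chain rule as follows:

$$ a\cdot\nabla\hat{x} = a\cdot\nabla\!\left(\frac{x}{|x|}\right) = \frac{a\cdot\nabla x}{|x|} - \frac{x\;a\cdot\nabla|x|}{|x|^2} = \frac{a}{|x|} - \frac{(a\cdot\hat{x})\,\hat{x}}{|x|}, $$

so that

$$ a\cdot\nabla\hat{x} = \frac{a - (a\cdot\hat{x})\,\hat{x}}{|x|} = \frac{\hat{x}\,\hat{x}\wedge a}{|x|}. \tag{8.9} $$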


Other important derivatives are evaluated in the exercises. 
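
The results (8.8) and (8.9) can be checked against the limit definition (8.1)
by finite differences. The sketch below is only an illustration under stated
assumptions: vectors are plain Python tuples, the helper names are ours, and
(8.9) is compared in its first form, $[a - (a\cdot\hat{x})\hat{x}]/|x|$, which needs only
inner products.

```python
import math

def add(u, v): return tuple(p + q for p, q in zip(u, v))
def scale(c, u): return tuple(c * p for p in u)
def dot(u, v): return sum(p * q for p, q in zip(u, v))
def norm(u): return math.sqrt(dot(u, u))
def unit(u): return scale(1.0 / norm(u), u)

def directional_derivative(F, x, a, h=1e-6):
    """Central-difference estimate of a.grad F at x, following definition (8.1)."""
    plus, minus = F(add(x, scale(h, a))), F(add(x, scale(-h, a)))
    if isinstance(plus, tuple):                       # vector-valued F
        return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))
    return (plus - minus) / (2 * h)                   # scalar-valued F

x = (1.0, 2.0, 2.0)
a = (0.5, -1.0, 3.0)
xhat = unit(x)

d_magnitude = directional_derivative(norm, x, a)      # compare with (8.8)
d_direction = directional_derivative(unit, x, a)      # compare with (8.9)
rejection = add(a, scale(-dot(a, xhat), xhat))        # a - (a.xhat) xhat
expected_direction = scale(1.0 / norm(x), rejection)

print(d_magnitude, dot(a, xhat))
print(d_direction, expected_direction)
```

Note that a is deliberately not a unit vector here; the definition (8.1) and
the linearity (8.2b) hold for any a.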


Taylor’s Formula 


To approximate arbitrary functions of position by simpler functions, we need 
Taylor’s formula expressed in terms of the directional derivative. To this end, 
write 


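$$ G(\tau) = F(x + a\tau). $$

Then

$$ \frac{dG(0)}{d\tau} = \left.\frac{d}{d\tau}F(x + a\tau)\right|_{\tau=0} = a\cdot\nabla F(x), $$
$$ \frac{d^2G(0)}{d\tau^2} = (a\cdot\nabla)\,(a\cdot\nabla F(x)) = (a\cdot\nabla)^2F(x), $$
$$ \frac{d^kG(0)}{d\tau^k} = (a\cdot\nabla)^kF(x). $$

From the Taylor expansion of $G(\tau)$ taken about $\tau = 0$ and evaluated at $\tau = 1$,
we get

$$ G(1) = G(0) + \frac{dG(0)}{d\tau} + \frac{1}{2}\frac{d^2G(0)}{d\tau^2} + \cdots = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{d^kG(0)}{d\tau^k}. $$

Expressed in terms of F, this gives the desired Taylor expansion

$$ F(x + a) = F(x) + a\cdot\nabla F(x) + \frac{(a\cdot\nabla)^2}{2!}F(x) + \cdots
            = \sum_{k=0}^{\infty}\frac{(a\cdot\nabla)^k}{k!}F(x) = e^{a\cdot\nabla}F(x). \tag{8.10} $$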


Note that this leads to a natural definition of the exponential function $e^{a\cdot\nabla}$ of
the differential operator $a\cdot\nabla$.
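
As a short worked check of the expansion (the choice $F(x) = x^2$ is ours, for
illustration), the series terminates after three terms. Using (8.5), (8.6)
and (8.7):

```latex
\begin{align*}
a\cdot\nabla\, x^2 &= 2\,a\cdot x, \\
(a\cdot\nabla)^2 x^2 &= 2\,a\cdot(a\cdot\nabla x) = 2a^2, \\
(a\cdot\nabla)^k x^2 &= 0 \qquad (k \ge 3), \\
e^{a\cdot\nabla}x^2 &= x^2 + 2\,a\cdot x + \tfrac{1}{2!}\,2a^2 = (x+a)^2 .
\end{align*}
```

So the operator $e^{a\cdot\nabla}$ does exactly what the expansion claims in this case: it
shifts the argument of the function from x to x + a.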


The Differential 


The Taylor expansion (8.10) reveals a property of the function $a\cdot\nabla F(x)$ of
such importance that it deserves a special notation and a name. We will use
the term differential as an alternative to the term directional derivative, and we
introduce the notation

$$ F'(a, x) = a\cdot\nabla F(x). \tag{8.11} $$

This notation is intended to emphasize the fact that $F' = F'(a, x)$ is a function
of two variables obtained from F = F(x) by differentiation. When the dependence
on a with x held fixed is of interest, it is convenient to write $F' = F'(a)$,
which still reminds us that this function was obtained from a function F. It
must be noted that the differential is a linear function of its first variable,
which is to say that it has the properties

$$ F'(a + b) = F'(a) + F'(b), \tag{8.12a} $$
$$ F'(\lambda a) = \lambda F'(a) \tag{8.12b} $$

for scalar $\lambda$. Here we have merely written Equations (8.2a, b) in a different
notation.
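Now suppose that we are interested in the behavior of a function F = F(x)
in the neighborhood of a point $x_0$. For $x = x_0 + r$, a Taylor expansion about $x_0$ gives

$$ F(x) = F(x_0 + r) = F(x_0) + r\cdot\nabla F(x_0) + \tfrac{1}{2}(r\cdot\nabla)^2F(x_0) + \cdots $$
$$ \phantom{F(x)} = F(x_0) + |r|\,\hat{r}\cdot\nabla F(x_0) + \frac{|r|^2}{2!}(\hat{r}\cdot\nabla)^2F(x_0) + \cdots + \frac{|r|^k}{k!}(\hat{r}\cdot\nabla)^kF(x_0) + \cdots. $$

For $|r| = |x - x_0|$ sufficiently small, the first two terms approximate F(x) to
any desired accuracy, and we can write

$$ F(x) - F(x_0) \approx F'(x - x_0) = F'(x) - F'(x_0). \tag{8.13} $$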


Thus, we see that the differential provides a linear approximation to any
differentiable function. Since linear functions are simple enough to be
analyzed completely, this establishes the great importance of the differential.

Although we use the terms “differential” and “directional derivative”
interchangeably, they have different historical roots and emphasize different
aspects of the same function. If a is a fixed vector, the term “directional
derivative” is most appropriate to emphasize that $a\cdot\nabla F$ is the derivative in a
particular direction, the direction of a. On the other hand, the term
“differential” serves to emphasize that $F'(a)$ is a linear function of a.
Unfortunately, the term “differential” is commonly taken to connote a “small
quantity”, especially in the older literature. It must be realized that the
differential $F'(a)$ is defined for all values of a, not just small ones. However,
as we have seen in obtaining (8.13), the differential $F'(x - x_0)$ may be a good
approximation to $F(x) - F(x_0)$ only if $|x - x_0|$ is sufficiently small.


Variation on a Curve 


In mechanics we will often be interested in how some function of position
F = F(x) varies along the path of a particle x = x(t). Strictly speaking,
x = x(t) is a parametric equation for the particle path and not the path itself,
which is a set of points, but we suppress such distinctions when they are not at
issue. To describe the variation of F along the path we must
differentiate the composite function F = F(x(t)). The derivative can be
reduced to derivatives of F(x) and x(t) by proving that

$$ \frac{dF}{dt}(x(t)) = \left.\frac{d}{d\tau}F(x(t + \tau))\right|_{\tau=0} = \left.\frac{d}{d\tau}F(x(t) + \tau\dot{x}(t))\right|_{\tau=0}. $$

It will be left to the interested reader to fill in the missing mathematical details
of the proof. The last term is seen to be the directional derivative, so we can
write

$$ \frac{dF}{dt}(x(t)) = \dot{x}(t)\cdot\nabla F(x)\Big|_{x=x(t)}. \tag{8.14} $$

This is the chain rule for the composite function F = F(x(t)).


Partial and Total Derivatives 


More generally, let F = F(x, t) be a function of position x as well as time t.
For fixed x, the time derivative is denoted and defined by

$$ \partial_t F(x, t) = \lim_{\Delta t\to 0}\frac{F(x, t + \Delta t) - F(x, t)}{\Delta t}. \tag{8.15} $$

This describes how F varies with time at each point x. The derivative $\partial_t F$ is
called the “partial derivative of F with respect to time.” If $\partial_t F = 0$, then F is
said to be static or “constant in time” or “not an explicit function of time”,
and we write F(x, t) = F(x). On the other hand, if $a\cdot\nabla F = 0$ for all directions
a and all points x, then F is said to be uniform or “uniform in space”, and we
can write F(x, t) = F(t). Obviously if F = F(x, t) is both static and uniform,
then dF/dt = 0 on any path x = x(t). In this case, F is said to be constant.

To describe how F varies along a particle path x = x(t), we must
differentiate F = F(x(t), t). Using (8.14) and (8.15), we get

$$ \frac{dF}{dt}(x(t), t) = \partial_t F(x(t), t) + \dot{x}(t)\cdot\nabla F(x(t), t). \tag{8.16a} $$

Notice that we have two terms here, because we have used a generalization of
the product rule (7.6b) which allows us to separately differentiate the two
distinct functional dependencies on time. Our notation enables us to suppress
the variables in (8.16a) without confusion and write

$$ \frac{dF}{dt} = \partial_t F + \dot{x}\cdot\nabla F = (\partial_t + \dot{x}\cdot\nabla)F. \tag{8.16b} $$

The derivative dF/dt is given many different names in the literature; the term
total derivative is one of the most common, but the term convective derivative
is most appropriate, because it suggests change in a function with respect to
flow along a path x = x(t).
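
A numerical illustration of (8.16a) may help fix ideas. This is a minimal
sketch in which the field $F(x, t) = x^2\cos t$ and the path x(t) are hypothetical
choices of ours; the left side is estimated by differencing F along the path,
and the right side is assembled from the partial derivative and the
directional derivative, using $\nabla x^2 = 2x$.

```python
import math

def F(x, t):
    """Hypothetical scalar field F(x, t) = (x.x) * cos(t)."""
    return sum(c * c for c in x) * math.cos(t)

def path(t):
    """Hypothetical particle path x(t)."""
    return (math.cos(t), math.sin(2.0 * t), t)

def velocity(t):
    """Analytic derivative of path(t)."""
    return (-math.sin(t), 2.0 * math.cos(2.0 * t), 1.0)

def total_derivative(t, h=1e-6):
    """Central difference of F(x(t), t) taken along the path."""
    return (F(path(t + h), t + h) - F(path(t - h), t - h)) / (2.0 * h)

def convective_form(t):
    """Right side of (8.16a): partial_t F + xdot . grad F, from analytic pieces."""
    x, v = path(t), velocity(t)
    partial_t = -sum(c * c for c in x) * math.sin(t)
    grad = tuple(2.0 * c * math.cos(t) for c in x)    # gradient of x.x is 2x
    return partial_t + sum(vi * gi for vi, gi in zip(v, grad))

for t in (0.0, 0.4, 1.3):
    print(t, total_derivative(t), convective_form(t))
```

Neither term alone reproduces the total derivative; both the explicit time
dependence and the motion through the field contribute.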


Line Integrals 


Having seen how scalar derivatives are related to directional derivatives, we 
now consider how integrals with respect to a scalar parameter are related to 
integrals in space. 

Let $\mathcal{C}$ be a smooth curve in $\mathcal{E}_3$ from point a to point b, and let F = F(x) be a
multivector-valued function defined at each point x on $\mathcal{C}$. The line integral of
F on $\mathcal{C}$ is defined by

$$ \int_{\mathcal{C}} F(x)\,dx = \lim_{\substack{\Delta x_i\to 0 \\ n\to\infty}}\ \sum_{i=1}^{n} F(x_i)\,\Delta x_i. \tag{8.17} $$

The limit can be understood with reference to Figure 8.1. Points $x_0 = a$, $x_1$, $x_2$,
..., $x_n = b$ are selected on the curve $\mathcal{C}$; they determine chords $\Delta x_i = x_i - x_{i-1}$
and the sum on the right side of (8.17). The larger the number of
points selected and the smaller the $|\Delta x_i|$, the more closely the sum in (8.17)
approximates the integral, if the k-vector parts $\langle F(x)\rangle_k$ do not vary too
rapidly along the curve. Although it involves vectors instead of scalars, the
limit process defining the line integral (8.17) is formally the same as the one
defining the scalar integral in elementary calculus. Indeed, if the curve $\mathcal{C}$ is
represented parametrically by the equation x = x(t) with $x(\alpha) = a$ and $x(\beta) = b$,
then it is not difficult to prove that

$$ \int_{\mathcal{C}} F\,dx = \int_{\alpha}^{\beta} F(x(t))\,\dot{x}(t)\,dt. \tag{8.18} $$


We could have used (8.18) as a definition of the line integral in terms of the 
scalar integral already studied. However, definition (8.17) has the advantage 
over (8.18) of being completely independent of any parametric representation 
of the curve, giving it a certain conceptual and computational simplicity. For 
example, when F(x) = 1, we can easily evaluate the sum in (8.17), and noting 
that it is independent of n (see Figure 8.1), we get the result 


$$ \int_{\mathcal{C}} dx = b - a. \tag{8.19} $$


Fig. 8.1. Approximation of a curve C by line segments.


For any closed curve the end points are identical, and the result can be written

$$ \oint dx = 0. \tag{8.20} $$

This should be interpreted as a sum of vectors adding to zero.

Because of the relation (8.18), the general properties of the line integral are 
so easily determined from those of the scalar integral that it is hardly 
necessary to write them down. However, some comments on notation and 
some words of caution are in order. When the relevant variables and the 
domain of integration are clear from the context, they can be suppressed in 


the notation for integrals. For instance, (8.18) can be written in the
abbreviated form

$$ \int F\,dx = \int F\,\dot{x}\,dt. $$

To indicate the endpoints of the line integral we may write

$$ \int_a^b F\,dx; $$

this notation is especially apt when the function F is such that the value of the
integral is independent of the path (curve) from a to b. This is obviously the
case for the integral (8.19), which accordingly may be written

$$ \int_a^b dx = b - a. $$


It must be remembered, however, that in general the value of the line integral
depends on the entire curve $\mathcal{C}$ and not on the endpoints alone. The caution
about ordering factors deserves repeating as well; it must be realized that the
integral $\int dx\,F$ is not necessarily equivalent to $\int F\,dx$ unless the product $F\,dx$
is commutative, as when F is scalar-valued.


Line Integrals of Vector Fields 


For a more specific example of a line integral, let f = f(x) be a vector-valued
function on the curve $\mathcal{C}$. Since dx is also vector-valued we have

$$ \int f\,dx = \int f\cdot dx + \int f\wedge dx. \tag{8.21a} $$

This integral has a scalar part

$$ \int f\cdot dx = \tfrac{1}{2}\int f\,dx + \tfrac{1}{2}\int dx\,f, \tag{8.21b} $$

and a bivector part

$$ \int f\wedge dx = \tfrac{1}{2}\int (f\,dx - dx\,f). \tag{8.21c} $$

Both parts are called line integrals and, because they are of different grade,
they can be considered separately. But there are times when they are best
considered together as in (8.21a).

Next to constant f, the simplest vector-valued function is f(x) = x, with the
line integral

$$ \int x\,dx = \int x\cdot dx + \int x\wedge dx. \tag{8.22} $$

Both the scalar and bivector parts of this integral are of independent interest,
so let us consider them separately. First, the scalar part. If we represent the
curve parametrically by x = x(t), then according to (7.8),

$$ x\cdot\frac{dx}{dt} = \frac{1}{2}\frac{dx^2}{dt}. $$

So

$$ \int_a^b x\cdot dx = \int_{\alpha}^{\beta} x\cdot\dot{x}\,dt = \frac{1}{2}\int_{\alpha}^{\beta}\frac{dx^2}{dt}\,dt. $$

But we are at liberty to choose the scalar $x^2$ as our parameter, because there is
no contribution to the integral from portions of the curve where $x^2$ is constant.
Hence we get the specific result

$$ \int_a^b x\cdot dx = \frac{1}{2}\int_{a^2}^{b^2} d(x^2) = \tfrac{1}{2}(b^2 - a^2). \tag{8.23} $$

Since its value depends only on the endpoints, this integral is
path-independent.
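
The path independence of (8.23) can be seen directly from the chord sums of
the definition (8.17). In the sketch below, the two sample paths and all
function names are our own illustrative choices; both paths join the same
endpoints a and b, and both sums approach $\tfrac{1}{2}(b^2 - a^2)$ as the subdivision is
refined.

```python
import math

def chord_sum_scalar_part(points):
    """Sum of x_i . dx_i over chords dx_i = x_i - x_{i-1}, as in (8.17) with F(x) = x."""
    total = 0.0
    for prev, cur in zip(points, points[1:]):
        dx = tuple(c - p for c, p in zip(cur, prev))
        total += sum(ci * di for ci, di in zip(cur, dx))
    return total

def straight(a, b, n):
    """n chords along the straight segment from a to b."""
    return [tuple(ai + (bi - ai) * k / n for ai, bi in zip(a, b)) for k in range(n + 1)]

def wiggly(a, b, n):
    """Same endpoints, with a sideways bulge that vanishes at both ends."""
    pts = []
    for k in range(n + 1):
        s = k / n
        bump = math.sin(math.pi * s)
        base = tuple(ai + (bi - ai) * s for ai, bi in zip(a, b))
        pts.append((base[0] + 2.0 * bump, base[1] - bump, base[2] + 0.5 * bump))
    return pts

a, b = (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)
exact = 0.5 * (sum(c * c for c in b) - sum(c * c for c in a))   # (b^2 - a^2) / 2

print(exact)
print(chord_sum_scalar_part(straight(a, b, 20000)))
print(chord_sum_scalar_part(wiggly(a, b, 20000)))
```

The residual difference between each sum and the exact value is of order
1/n; it comes from the finite chord length, not from the choice of path.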


The Area Integral 


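Now consider the bivector part of (8.22) written in the form

$$ \mathbf{A} \equiv \frac{1}{2}\int_a^b x\wedge dx. \tag{8.24} $$

The significance of this integral can be understood by approximating it by a
sum in accordance with the definition (8.17):

$$ \mathbf{A} \approx \frac{1}{2}\sum_{i=1}^{n} x_i\wedge\Delta x_i
   = \tfrac{1}{2}\,x_1\wedge\Delta x_1 + \tfrac{1}{2}\,x_2\wedge\Delta x_2 + \cdots + \tfrac{1}{2}\,x_n\wedge\Delta x_n. \tag{8.25} $$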


As illustrated in Figure 8.2, each term in this sum is the directed area of a
triangle with one vertex at the origin. The first term $\tfrac{1}{2}x_1\wedge\Delta x_1$ approximates the
directed area “swept out” by the line segment represented by the vector variable
x as its tip “moves” continuously along the curve from $x_0$ to $x_1$ while its tail is
anchored at the origin. Therefore, the sum in (8.25) approximates the
directed area swept out as the variable x moves from a to b. Accordingly, we
arrive at the exact interpretation of the integral (8.24) as the total directed
area “swept out” by the vector variable x as it moves continuously along the
curve from a to b. This interpretation makes it obvious that the value of the
integral is not path independent, because the area swept out depends on the
path from a to b.


Fig. 8.2. Polygonal area approximation.

If the curve is represented by the parametric equation x = x(t) with
x(0) = a, then the area swept out can also be expressed as a parametric
function $\mathbf{A} = \mathbf{A}(t)$ by writing (8.24) in the form

$$ \mathbf{A}(t) = \frac{1}{2}\int_{x(0)}^{x(t)} x\wedge dx = \frac{1}{2}\int_0^t x\wedge\dot{x}\,dt. \tag{8.26} $$

Differentiating with respect to the upper limit of the integral, we get

$$ \dot{\mathbf{A}} = \tfrac{1}{2}\,x\wedge\dot{x}, \tag{8.27} $$

expressing the rate at which area is swept out. This rate depends on the choice
of parameter t, although the total area swept out depends only on the curve.

Consider a closed curve $\mathcal{C}$ in a plane enclosing the origin, as shown in
Figure 8.3a. The line integral

$$ \mathbf{A} = \frac{1}{2}\oint x\wedge dx \tag{8.28} $$

along $\mathcal{C}$ gives us the directed area “enclosed” by the curve. Its magnitude
$|\mathbf{A}|$ is the conventional scalar measure of area enclosed by the curve. This
should be evident from the fact that the vector x “sweeps through” each of
the points enclosed by the curve exactly once, or again by considering
approximation of the integral by the areas of triangles as expressed by (8.25),
which applies here with $x_0 = x_n$. As in Section 1.3, we represent the unit of
directed area for the plane by a bivector i. Then $\mathbf{A} = A\,i$, where $A = |\mathbf{A}|$ if
the curve $\mathcal{C}$ has a counterclockwise orientation (as it does in Figure 8.3a), or
$A = -|\mathbf{A}|$ if $\mathcal{C}$ has a clockwise orientation. For the situation depicted by
Figure 8.3a, we have

$$ \tfrac{1}{2}\,x_k\wedge\Delta x_k = \tfrac{1}{2}\,i\,|x_k\wedge\Delta x_k| \tag{8.29a} $$

for the kth “element of area”, whence from (8.28),

$$ |\mathbf{A}| = \frac{1}{2}\oint |x\wedge dx|. \tag{8.29b} $$

It must be emphasized that (8.29b) follows from (8.28) only when all coplanar
elements of area have the same orientation, as in (8.29a). This condition is
not met if the curve $\mathcal{C}$ is self-intersecting or does not enclose the origin.

The area integral (8.28) is independent of the origin, in spite of the fact that
the values of the vector variable x in the integrand depend on the origin. To
understand how this can be, displace the origin from inside the curve $\mathcal{C}$ shown
in Figure 8.3a to a place outside the curve as shown in Figure 8.3b. Choosing
points a and b on $\mathcal{C}$, as shown in Figure 8.3b, we separate $\mathcal{C}$ into two pieces $\mathcal{C}_1$
and $\mathcal{C}_2$, so the area integral can be written

$$ \mathbf{A} = \frac{1}{2}\oint x\wedge dx = \frac{1}{2}\int_{\mathcal{C}_1} x\wedge dx + \frac{1}{2}\int_{\mathcal{C}_2} x\wedge dx. \tag{8.30} $$


Fig. 8.3b. Area swept out by radius vector along a closed curve. Crosshatched region is swept
out twice in opposite directions, so its area is zero.


Referring to Figure 8.3b, we see that the coordinate vector sweeps over the
region inside $\mathcal{C}$ once as it goes from a to b along $\mathcal{C}_1$, but it sweeps over the
region to the left of $\mathcal{C}_2$ twice, once as it traverses $\mathcal{C}_1$ and again as it traverses
$\mathcal{C}_2$; since the two sweeps over the latter region are in opposite directions, the
directed areas they sweep out have the same magnitude but opposite sign, so
their contributions to the integral (8.30) cancel, and we are left with the
directed area enclosed by $\mathcal{C}$, as claimed.

For a general proof that the closed area integral (8.28) is independent of the
origin, we displace the origin by an “amount” c by making the change of
variables $x \to x' = x - c$. Then,

$$ \oint x'\wedge dx' = \oint (x - c)\wedge dx = \oint x\wedge dx - c\wedge\oint dx. $$

But the last term vanishes because $\oint dx = 0$, so the independence of origin
is proved. Note that the vector c is entirely arbitrary, so our restriction of the
origin to the plane of the curve in Figure 8.3b is quite irrelevant to the value
of the area integral, though it helped us see how parts of the integral cancel
when the origin is not enclosed by the curve; such cancellation occurs even
when the origin is outside the plane, as our proof of origin independence
implies.
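
Both the sign convention and the origin independence are easy to observe
numerically for a closed polygon in the plane. The sketch below is an
illustration under our own assumptions: points are coordinate pairs, the
bivector $x\wedge y$ is represented by its coefficient $x_1y_2 - x_2y_1$ on the unit
bivector i, and each term $\tfrac{1}{2}x_k\wedge\Delta x_k$ is written in the equal form $\tfrac{1}{2}x_{k-1}\wedge x_k$.

```python
import math

def directed_area(points):
    """Coefficient on i of (1/2) * sum of x_{k-1} ^ x_k around a closed polygon."""
    total = 0.0
    for k in range(len(points)):
        x1, y1 = points[k - 1]        # k = 0 wraps to the last point, closing the curve
        x2, y2 = points[k]
        total += 0.5 * (x1 * y2 - y1 * x2)
    return total

def ellipse(n, semi_a, semi_b, center=(0.0, 0.0), clockwise=False):
    """n points on an ellipse; the center offset plays the role of a displaced origin."""
    sign = -1.0 if clockwise else 1.0
    return [(center[0] + semi_a * math.cos(2 * math.pi * k / n),
             center[1] + sign * semi_b * math.sin(2 * math.pi * k / n))
            for k in range(n)]

inside = directed_area(ellipse(2000, 3.0, 2.0))                         # origin enclosed
outside = directed_area(ellipse(2000, 3.0, 2.0, center=(10.0, -7.0)))   # origin not enclosed
reversed_sense = directed_area(ellipse(2000, 3.0, 2.0, clockwise=True))

print(inside, outside, reversed_sense, math.pi * 3.0 * 2.0)
```

With the origin outside the curve, individual triangles have both signs, yet
the total is unchanged, which is the cancellation described above.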

Our discussion shows that the integral (8.28) generalizes the ancient concept
of the area enclosed by a simple closed curve to a concept of directed
area determined by an arbitrary closed curve. Thus, the integral (8.28) defines
an “enclosed” area even for self-intersecting plane curves such as the one
shown in Figure 8.4; the sign of the area integral for subregions is indicated
in the figure, with zero for subregions which are “swept out” twice with
cancelling signs. The integral (8.28) also applies to closed curves in space
which do not lie in a plane, but we will not need to consider its significance
further in this book.


Fig. 8.4. Directed area of a self-intersecting closed plane curve. Vertical and horizontal
lines denote areas with opposite orientation, so crosshatched regions have zero area.


The General Line Integral


Our definition (8.17) of the line integral is not the most general one possible,
though it will suffice for most purposes of this book. A word about the general
case may be helpful.
Consider a multivector-valued function L(a, x) of two vector variables which
is linear in the first variable. The line integral of L(a, x) along some curve $\mathcal{C}$ is
defined as in (8.17) by

$$ \int_{\mathcal{C}} L(dx, x) = \lim_{\substack{\Delta x_i\to 0 \\ n\to\infty}}\ \sum_{i=1}^{n} L(\Delta x_i, x_i). \tag{8.31} $$

The linearity of the first variable in L(a, x) is necessary for the limit in (8.31)
to be independent of the subdivision of the curve. Equation (8.18) is now
generalized to

$$ \int_{\mathcal{C}} L(dx, x) = \int_{\alpha}^{\beta} L(\dot{x}(t), x(t))\,dt. \tag{8.32} $$


We consider but one special case of such an integral. Suppose that, at every
point in some region $\mathcal{R}$ containing $\mathcal{C}$, the function $L(\dot{x}, x)$ is the differential of
some function F = F(x). Then we write

$$ L(\dot{x}, x) = \dot{x}\cdot\nabla F(x), \tag{8.33} $$

and using (8.16) in (8.32), we get

$$ \int_{\mathcal{C}} dx\cdot\nabla F(x) = \int_{\alpha}^{\beta}\frac{dF}{dt}\,dt = F(b) - F(a). \tag{8.34} $$

Since the value of the integral is completely determined by the value of F at
the endpoints, the integral is independent of the path in $\mathcal{R}$. Thus we see that
for path-independence of an integral it is sufficient that the integrand be a
differential of some function. The terms “perfect differential” or “exact
differential” are also used to indicate path-independence in the literature.


The Gradient 


Path-independent integrals arise most commonly in connection with an important
kind of vector field. If a vector field f = f(x) has the property that
$a\cdot f = a\cdot\nabla\phi$ is the differential of some scalar field $\phi = \phi(x)$, then we write

$$ f = \nabla\phi \tag{8.35} $$

and say that f is the gradient of $\phi$. We say that $\phi$ is a potential of f. The
gradient has a simple geometric interpretation which follows from the fact
that $a\cdot\nabla\phi$ is a directional derivative. The directional derivative tells us the
rate at which the value of $\phi$ changes in the direction a. If a is a unit vector with
direction of our choosing, then $a\cdot\nabla\phi$ has its maximum value when a and $\nabla\phi$
are in the same direction, that is, when $a\cdot\nabla\phi = |\nabla\phi|$. Thus, the gradient
$\nabla\phi = \nabla\phi(x)$ tells us both the direction and magnitude of maximum change in
the value of $\phi = \phi(x)$ at any point x where it is defined. Furthermore, the
change of $\phi$ in any given direction a is obtained by taking the inner product of
a with $\nabla\phi$.

As shown in Figure 8.5, the equation $\phi(x) = k$ defines a one-parameter
family of surfaces, called equipotential surfaces, one surface for each constant
value of k. At a given point x, the gradient $\nabla\phi$ is normal (perpendicular)
to the surface through that point, and, when not zero, it is directed towards
surfaces with larger values of k. Figure 8.5 shows only a 2-dimensional
cross-section. As is conventional, the change in k is the same for each pair of
neighboring surfaces, so the separation provides a measure of the change in $\phi$;
the closer the surfaces, the larger the gradient. The change in $\phi$ between any
two points a and b is given by

$$ \int_a^b dx\cdot f = \int_a^b dx\cdot\nabla\phi = \phi(b) - \phi(a). \tag{8.36} $$


Fig. 8.5. The gradient vector is orthogonal to the equipotential at every point.


As we have noted, the value of such an integral is independent of the path.
For a given scalar function $\phi = \phi(x)$, the gradient is easily found from the
directional derivative by interpreting $a\cdot\nabla\phi$ as the inner product of $\nabla\phi$ with an
arbitrary vector a. Thus, for $\phi(x) = x\cdot g$ where g is a constant vector, we get
$a\cdot\nabla(x\cdot g) = a\cdot g$ from Equation (8.5), hence

$$ \nabla(x\cdot g) = g. \tag{8.37} $$

Similarly, from (8.7) and (8.8) we get

$$ \nabla x^2 = 2x, \tag{8.38} $$

$$ \nabla |x| = \hat{x}. \tag{8.39} $$


These formulas enable us to evaluate the gradient of certain functions without
referring to the directional derivative at all. Thus, if $F = F(|x|)$ is a function
of the magnitude of x but not its direction, then, by using (8.39) in connection
with the chain rule (8.4), we get

$$ \nabla F = \hat{x}\,\frac{\partial F}{\partial |x|}. \tag{8.40} $$

In NFII, we shall see that the gradient operator $\nabla$ can be regarded as the
derivative with respect to the vector x. Then we shall see that the directional
derivative $a\cdot\nabla$ can indeed be regarded as the inner product of a vector a with
a vector operator $\nabla$, just as our notation suggests.
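
As a brief worked instance of (8.40), take the inverse-distance function
$F(|x|) = |x|^{-1}$ (our choice of example, for $x \neq 0$). The chain rule (8.4)
with (8.8) gives the directional derivative, and (8.40) gives the gradient:

```latex
\begin{align*}
\frac{\partial}{\partial |x|}\,|x|^{-1} &= -\,|x|^{-2}, \\
a\cdot\nabla\,|x|^{-1} &= (a\cdot\nabla |x|)\left(-|x|^{-2}\right) = -\,\frac{a\cdot\hat{x}}{|x|^{2}}, \\
\nabla\,|x|^{-1} &= -\,\frac{\hat{x}}{|x|^{2}} = -\,\frac{x}{|x|^{3}} .
\end{align*}
```

The two results agree, as they must: the inner product of a with the gradient
in the last line reproduces the directional derivative in the second.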


2-8. Exercises 


(8.1) Evaluate the derivatives

$$ a\cdot\nabla(x\times b), $$
$$ a\cdot\nabla(x\cdot\langle A\rangle_k), $$
$$ a\cdot\nabla[x\cdot(x\wedge b)], $$

where b and A are independent of x.
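(8.2) Let $\mathbf{r} = \mathbf{r}(x) = x - x'$ and $r = |\mathbf{r}| = |x - x'|$, where x' is a vector
independent of x. Verify the following derivatives:

(a) $a\cdot\nabla r = a\cdot\hat{r}$

(b) $a\cdot\nabla\hat{r} = \dfrac{\hat{r}\,\hat{r}\wedge a}{r}$

(c) $a\cdot\nabla(\hat{r}\cdot a) = \dfrac{|\hat{r}\wedge a|^2}{r}$

(d) $a\cdot\nabla(\hat{r}\wedge a) = -\dfrac{(\hat{r}\cdot a)\,\hat{r}\wedge a}{r}$

(e) $a\cdot\nabla|\hat{r}\wedge a| = -\dfrac{(\hat{r}\cdot a)\,|\hat{r}\wedge a|}{r}$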
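el 3(a-f)’ — | Faa
(aS, Se aa
r :
(i) ¢(aV) = =


(j) $a\cdot\nabla\log r = \dfrac{a\cdot\mathbf{r}}{r^2}$

(k) $a\cdot\nabla\,\mathbf{r}^{2k} = 2k\,(a\cdot\mathbf{r})\,\mathbf{r}^{2(k-1)}$

(l) $a\cdot\nabla\,\mathbf{r}^{2k+1} = \mathbf{r}^{2k}\,(a + 2k\,a\cdot\hat{r}\,\hat{r})$

In the last two cases, k is any nonzero integer and $r \neq 0$ if k < 0.

(8.3) Show that the Taylor expansion of $(x - a)^{-1}$ about x is term by term
equivalent to the series

$$ \frac{1}{x - a} = \frac{1}{x} + \frac{1}{x}\,a\,\frac{1}{x} + \frac{1}{x}\,a\,\frac{1}{x}\,a\,\frac{1}{x} + \cdots. $$

The series is convergent if $|a| < |x|$.

(8.4) The Legendre Polynomials $P_n(x\cdot a)$ can be defined as the coefficients
in the power series expansion

$$ \frac{1}{|x - a|} = \sum_{n=0}^{\infty}\frac{P_n(x\cdot a)}{|x|^{2n+1}} = \sum_{n=0}^{\infty}\frac{P_n(\hat{x}\cdot a)}{|x|^{n+1}}
   = \frac{P_0(x\cdot a)}{|x|} + \frac{P_1(x\cdot a)}{|x|^3} + \frac{P_2(x\cdot a)}{|x|^5} + \frac{P_3(x\cdot a)}{|x|^7} + \cdots. $$

The series converges for $|a| < |x|$.
Use a Taylor expansion to evaluate the polynomials of lowest order:

$$ P_0(x\cdot a) = 1, $$
$$ P_1(x\cdot a) = x\cdot a, $$
$$ P_2(x\cdot a) = \tfrac{1}{2}\left[3(x\cdot a)^2 - a^2x^2\right] = (x\cdot a)^2 + \tfrac{1}{2}(x\wedge a)^2, $$
$$ P_3(x\cdot a) = \tfrac{1}{2}\left[5(x\cdot a)^3 - 3a^2x^2\,x\cdot a\right] = (x\cdot a)^3 + \tfrac{3}{2}\,x\cdot a\,(x\wedge a)^2. $$

The $P_n(x\cdot a)$ are polynomials of vectors. Show that they are homogeneous
functions of degree n, that is,

$$ P_n(x\cdot a) = |x|^n P_n(\hat{x}\cdot a) = |a|^n|x|^n P_n(\hat{x}\cdot\hat{a}). $$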
(8.5) Verify the value given for the following line integral:

$$ \int_a^b x^{-1}\,dx = \log(a^{-1}b), $$

taken along any continuous curve in the $(a\wedge b)$-plane which does not
pass through the origin. Write separate integrals for the k-vector
parts. Note that the integral is multivalued, and specify conditions
on the curve which give the principal value of the logarithm. (Recall
the discussion of the logarithmic function in Section 2-5.)

Introduce the spinor-valued function $z = z(x) = a^{-1}x$ and a parametric
equation x = x(t) for the curve and show that the integral
can be written in the equivalent forms

$$ \int_a^b x^{-1}\,dx = \int_0^t \frac{1}{z}\frac{dz}{dt}\,dt = \int_1^{a^{-1}b}\frac{dz}{z}. $$

The last form of the integral is discussed extensively in textbooks on
“Functions of a Complex Variable”.


Chapter 3 


Mechanics of a Single Particle 


In this chapter we learn how to use geometric algebra to describe and analyze 
the motion of a single particle. From a physical point of view, we will be 
concerned with constructing the simplest models for a physical system and 
discussing their applications and limitations. From a mathematical point of 
view we will be concerned with solving the simplest second order vector 
differential equations and analyzing the geometrical properties of the solutions. 
Most of the results of Chapter 2 will be used in this chapter in one way or 
another. Of course, the main mathematical tool will be the geometric algebra 
of physical space. Since the reader is presumed to have some familiarity with 
mechanics already, a number of basic terms and concepts will be used with no 
more than the briefest introduction. A critical and systematic analysis of the 
foundations of mechanics will be undertaken in Chapter 9. General methods 
for solving differential equations will be developed as they are needed. 
Although only the simplest models and differential equations are considered
in this chapter, the results should not be regarded as trivial, for the
simple models provide the starting point for analyzing and solving complex
problems. Consequently, the time required for developing an elegant
formulation and thorough analysis of simple models is time well spent.


3-1. Newton’s Program 


Isaac Newton (1642-1727) is rightly regarded as the founder of the science 
called mechanics. Of course, he was neither the first nor the last to make 
important contributions to the subject. He deserves the title of ‘founder’, 
because he integrated the insights of his predecessors into a comprehensive 
theory. Furthermore, he inaugurated a program to refine and extend that 
theory by systematically investigating and classifying the properties of all 
physical objects. Newtonian mechanics is, therefore, more than a particular 
scientific theory; it is a well-defined program of research into the structure of 
the physical world. 
This section reviews the major features of Newtonian mechanics as it stands
today. Naturally, a modern formulation of mechanics differs somewhat from 
Newton’s, but this is not the place to trace the intricacies of its evolution. 
Also, for the time being we take the fundamental concepts of space and time 
for granted, just as Newton did in his Principia. It will not be profitable to 
wrestle with subtleties in the foundations of mechanics until some proficiency 
with the mathematical formalism has been developed, so we delay the 
attempt until Chapter 9. 

The grand goal of Newton’s program is to describe and explain all proper- 
ties of all physical objects. The approach of the program is determined by two 
general assumptions: first, that every physical object can be represented as a 
composite of particles; second, that the behavior of a particle is governed by 
interactions with other particles. The properties of a physical object, then, are 
determined by the properties of its parts. For example, structural properties 
of an object, such as rigidity or plasticity, are determined by the interactions
among its parts. The program of mechanics is to explain the diverse proper- 
ties of objects in our experience in terms of a few kinds of interactions among 
a few kinds of particles. 

The great power of Newtonian mechanics is achieved by formulating the 
generalities of the last paragraph in specific mathematical terms. It depends 
on a clear formulation of the key concepts: particle and interaction. A particle 
is understood to be an object with a definite orbit in space and time. The 
orbit is represented by a function x = x(t) which specifies the particle's
position x at each time t. To express the continuous existence of the particle in
some interval of time, the function x(t) must be a continuous function of the
variable t in that interval. When specified for all times in an interval, the
function x(t) describes a motion of the particle.

The central hypothesis of Newtonian mechanics is that variations in the 
motion of a particle are completely determined by its interactions with other 
particles. More specifically, the motion is determined by an equation of the 
general form 


$$ f = m\ddot{x}, \tag{1.1} $$


where $\ddot{x}$ is the acceleration of the particle, the scalar m is a constant called the
mass of the particle, and the force f expresses the influence of other particles.
This hypothesis is commonly referred to as “Newton’s second law of motion”,
though it was Euler who finally cast it in the form we use today.

Newton’s Law (1.1) becomes a definite differential equation determining 
the motion of a particle only when the force f is expressed as a specific 
function of x(t) and its derivatives. With this much understood, the thrust of 
Newton’s program can be summarized by the dictum: focus on the forces. This 
should be interpreted as an admonition to study the motions of physical 
objects and find forces of interaction sufficient to determine those motions. 
The aim is to classify the kinds of forces and so develop a classification of
particles according to the kinds of interactions in which they participate. The
classification is not complete today, but it has been carried a long way. 

Newton's program has been so successful largely because it has proved 
possible to account for the motions of physical objects with forces of simple 
mathematical form. The “forces of nature” appear to have two properties of
such universality that they could be regarded as laws, though they were not 
identified as such by Newton. We shall refer to them as the principles of 
additivity and analyticity. 

According to the principle of additivity (or superposition) of forces, the 
force f on a given particle can be expressed as the vector sum of forces $\mathbf{f}_k$
independently exerted by each particle with which it interacts, that is,


$$\mathbf{f} = \mathbf{f}_1 + \mathbf{f}_2 + \ldots = \sum_k \mathbf{f}_k. \tag{1.2}$$


This principle enables us to isolate and study different kinds of forces 
independently as well as reduce complex forces to a superposition of simple 
forces. Newton’s program could hardly have progressed without it. 

According to the principle of analyticity (or continuity), the force of one 
particle on another is an exclusive, analytic function of the positions and 
velocities of both particles. The adjective “exclusive” means that no other
variables are involved. The adjective “analytic” means that the function is
smoothly varying in the sense that derivatives (with respect to time) of all 
orders have finite values. The principle of analyticity implies that the force f 
on a particle with position x = x(t) and velocity $\dot{\mathbf{x}} = \dot{\mathbf{x}}(t)$ is always an analytic
function


$$\mathbf{f} = \mathbf{f}(\mathbf{x}, \dot{\mathbf{x}}, t), \tag{1.3}$$


where the explicit time dependence arises from motions of the particles 
determining the force. Notice that (1.3) assumes that the force is not a 
function of $\ddot{\mathbf{x}}$ and higher order derivatives. An exception to this rule is the
so-called “radiative reaction force” which requires special treatment that we
cannot go into here. Mathematical idealizations or approximations that 
violate the analyticity principle are often useful, as long as their ranges of 
validity are understood. 

A specific functional form for the force on a particle is commonly called a 
force law. For example, a force of the form f = −kx is called “Hooke’s Law”.
It should be understood, however, that there is more to a force law than a
mathematical formula; it is essential to know the law’s domain of validity, that
is, the circumstances in which it applies and the fidelity with which it represents
the phenomena. Much of physics is concerned with determining the
domains of validity for specific force laws, so it shall be our concern as well 
throughout this book. Therefore, we can only sketch a classification of forces 
here. The major distinction to be made is between fundamental and approxi- 
mate force laws. 

Physicists have discovered four kinds of fundamental forces, the gravitational,
the electromagnetic, the strong and the weak forces. They are called
“fundamental”, because every known force can be understood as a superpo- 
sition and approximation of these forces. The strong and the weak forces 
were discovered only fairly recently, because their effects on ordinary human 
experience are quite indirect. The strong forces bind the atomic nuclei; so 
they determine the naturally occurring elements, but they do not otherwise 
come into play in everyday experience. The weak forces govern radioactive 
decay. The strong and weak forces are not as well understood as the electro- 
magnetic and gravitational forces, so they are major objects of basic research 
today. Furthermore, they are mathematically formulated in terms of “quantum
mechanics” which goes beyond the “classical mechanics” developed
here. For these reasons, they will not be discussed further in this book. 

The gravitational force of a particle at a point x′ = x′(t) on a particle at
x = x(t) is given by


$$\mathbf{f}(\mathbf{x}, t) = -mm'G\,\frac{\mathbf{x} - \mathbf{x}'(t)}{|\mathbf{x} - \mathbf{x}'(t)|^3}, \tag{1.4}$$


where m and m’ are masses of the particles and G is an empirically known 
constant. This is Newton’s gravitational force law. Its domain of validity is 
immense. It applies to all particles with masses, and only minute deviations 
from it can be detected in the most sensitive astronomical experiments. For 
practical reasons, we often work with approximations to Newton’s law, but 
the law’s great validity enables us to estimate the accuracy of our approxi- 
mations with great confidence whenever necessary. 
The electromagnetic force on a particle with charge q has the form 


$$\mathbf{f}(\mathbf{x}, \dot{\mathbf{x}}, t) = q(\mathbf{E} + c^{-1}\dot{\mathbf{x}}\times\mathbf{B}), \tag{1.5}$$


where E = E(x, t) is an electric field, B = B(x, t) is a magnetic field, and c is a
constant with value equal to the speed of light. This is commonly called the 
Lorentz force law in honor of the man who first used it extensively to analyze 
the electromagnetic properties of material media. The charge q in (1.5) is a 
scalar constant characteristic of the particle; it can be positive, negative or 
zero; consequently, electromagnetic interactions determine a three-fold
classification of particles into positively, negatively or neutrally charged 
groups. The electric and magnetic fields in (1.5) can in principle be expressed 
as functions of the positions and velocities of the particles that produce them, 
but this is totally impractical in most applications, because the number of 
particles is very large, and often the information would be irrelevant because 
only a simple approximation to the functional dependence on x and t is
required. 

The known consequences and applications of electromagnetic interactions 
are vastly richer and more numerous than those of the other interactions. 
Research during the last hundred years or so has established the astounding
fact that, aside from simple gravitational attraction, all the manifold proper-
ties of familiar physical objects can be explained as consequences of electro- 
magnetic interactions. This includes explanations of solid, liquid and gaseous 
phases of matter, thermal and electrical conduction and resistance, the 
varieties of chemical binding and reaction; even the colors of objects are
explained as electromagnetic interactions of particles with light. The full
explanation requires “quantum mechanics”, but the “classical mechanics” of
concern in this book is indispensable, and its resources are far from exhausted.
In all this the Lorentz force law (1.5) plays a crucial role, so it must be 
regarded as one of the most important mathematical expressions in physics. 

The abstract mathematical framework of Newtonian mechanics does not 
specify any force laws. Newton began the search for “force laws of nature”
with the stunning proposal of his gravitational force law, which has served as a 
paradigm for force laws ever since. The fundamental force laws were dis- 
covered by a combination of empirical study and mathematical analysis. The 
importance of mathematical analysis in this endeavor should not be underesti- 
mated, even in Newton’s initial discovery. Textbooks proceed rapidly to 
Newton's law, but Newton spent years preparing himself mathematically. His 
preparatory studies in analytic geometry led him to a complete classification 
of third order algebraic plane curves, which is far beyond what students 
encounter today. The extent of Newton’s mathematical preparation is evident 
in The Mathematical Papers of Isaac Newton, recently published under the
careful editorship of D. T. Whiteside. Every serious student of physics should 
become acquainted with these splendid volumes. 

In this chapter we will be engaged in specific and general studies of a variety 
of force laws. As will be seen, mathematical analysis leads to a classification 
of force laws according to their mathematical properties. This is of great 
importance for many reasons: (1) It helps us identify common properties of
many force laws and systematically zero in on the specific law appropriate in a
given situation. (2) To understand why one force law is fundamental rather 
than another, we must examine a range of possibilities and learn to dis- 
tinguish the crucial from the unimportant properties of fundamental laws. (3) 
If we hope to refine or improve present laws, we must know what reasonable 
possibilities are available. (4) If we hope to develop a unified theory of the 
fundamental laws, we must know how they differ and what they have in 
common. (5) Finally, mathematical analysis is essential for practical approxi- 
mations and applications of the fundamental laws. 

We have mentioned the important distinction between fundamental and 
approximate laws. In the macroscopic domain of familiar physical objects, we 
deal mostly with approximate laws, because a reduction to fundamental laws 
is impractical. We call them “approximate laws” for the obvious reason that
they approximate fundamental laws. They are also called “empirical” or
“phenomenological” laws, because their relation to empirical evidence is
fairly direct, though there is usually reason to believe that the relation is
incomplete or approximate in some respect. 

It is useful to distinguish between long range and short range forces. Short
range approximate forces are also called contact forces, because they are 
exerted by the surface of one body or medium on the surface of another body 
in contact with it. The forces exerted by molecules on the two surfaces have a 
macroscopically short range. The resultant force supposed to act on either 
body is necessarily approximate, because so many molecules are involved. As 
examples of contact forces, we list friction, viscosity and buoyant forces. Long
range approximate forces are also called body forces, because they are 
exerted on particles throughout a macroscopic body. The gravitational force 
exerted by the Earth is the most familiar force of this kind. 

Once the force law has been determined, the problem of determining the 
motion of a particle is a strictly mathematical one. According to (1.1) and 
(1.3), the equation of motion necessarily has the form 


$$m\ddot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \dot{\mathbf{x}}, t). \tag{1.6}$$


This is called a second order differential equation, because it contains no 
derivatives of the dependent variable x with respect to the independent 
variable t of order greater than the second. Books on the theory of differential
equations prove that if f is an analytic function, then (1.6) has a unique
general solution depending only on two arbitrary vector constants (“vector”,
because the dependent variable is a vector). Designating these constants by a
and b, the general solution of (1.6) can be written as a function of the form


$$\mathbf{x} = \mathbf{x}(t, \mathbf{a}, \mathbf{b}). \tag{1.7}$$


If desired, the constants can be determined from initial conditions, that is, the
position $\mathbf{x}_0$ and the velocity $\mathbf{v}_0$ of the particle at time t = 0; thus from (1.7) we
get the equations


$$\mathbf{x}(0, \mathbf{a}, \mathbf{b}) = \mathbf{x}_0,$$
$$\dot{\mathbf{x}}(0, \mathbf{a}, \mathbf{b}) = \mathbf{v}_0. \tag{1.8}$$


Being two simultaneous vector equations in two vector variables, these
equations determine a, b from $\mathbf{x}_0$ and $\mathbf{v}_0$ or vice-versa.

We shall have many occasions to use this important general theorem, in 
particular, when solving the equations of motion. There are many ways to 
solve differential equations, but, whichever way we use, we will know that we 
have found all possible solutions if we have found one solution depending on 
two arbitrary independent vector constants. By assigning these constants 
appropriate values we determine a unique particular solution, for example, 
the one with given initial values $\mathbf{x}_0$, $\mathbf{v}_0$. It follows from the general theorem
that if the position and velocity of a particle subject to a known force are 
specified at any time, then the position and velocity at any subsequent time
are uniquely determined. For this reason, the position and velocity are
commonly called state variables and are said to designate the state of a 
particle. 
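
The way a known force law and an initial state fix the later state can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not a method taken from the text: it treats one-dimensional motion, uses Hooke's law with made-up constants k and m, and advances the state with a classical fourth-order Runge-Kutta step (the helper name `integrate` is hypothetical). The result is compared with the familiar analytic solution of the harmonic oscillator.

```python
import math

def integrate(force, m, x0, v0, t_end, steps=2000):
    """Advance the state (x, v) of m*x'' = force(x, v, t) from t = 0 to t_end (RK4)."""
    def deriv(x, v, t):
        # first-order form of the equation of motion: dx/dt = v, dv/dt = f/m
        return v, force(x, v, t) / m

    x, v, t = x0, v0, 0.0
    h = t_end / steps
    for _ in range(steps):
        k1x, k1v = deriv(x, v, t)
        k2x, k2v = deriv(x + 0.5 * h * k1x, v + 0.5 * h * k1v, t + 0.5 * h)
        k3x, k3v = deriv(x + 0.5 * h * k2x, v + 0.5 * h * k2v, t + 0.5 * h)
        k4x, k4v = deriv(x + h * k3x, v + h * k3v, t + h)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
    return x, v

# Illustrative (assumed) constants for Hooke's law f = -k x
k, m = 4.0, 1.0
x_num, v_num = integrate(lambda x, v, t: -k * x, m, x0=1.0, v0=0.0, t_end=1.0)

# Analytic solution for the same initial state: x = cos(omega t)
omega = math.sqrt(k / m)
x_exact = math.cos(omega * 1.0)
v_exact = -omega * math.sin(omega * 1.0)
```

Changing either initial value changes the final state, while repeating the run with the same initial state reproduces it, which is the sense in which position and velocity together designate the state.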


3-2. Constant Force 


In this section we study the motion of a particle subject to a constant gravi- 
tational force f = mg. Of course, our results describe the motion of a particle 
with charge q in a constant electric field E just as well; it is only necessary to
write g = (q/m)E for the electrical force per unit mass. 

According to the fundamental law of mechanics, a particle subject to a 
constant force undergoes a constant acceleration. For a force per unit mass g, 
the particle trajectory x = x(t) is determined by the differential equation 


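$$\ddot{\mathbf{x}} = \dot{\mathbf{v}} = \mathbf{g} \tag{2.1}$$
subject to the initial conditions

$$\dot{\mathbf{x}}(0) = \mathbf{v}(0) = \mathbf{v}_0, \tag{2.2a}$$

$$\mathbf{x}(0) = \mathbf{x}_0. \tag{2.2b}$$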


Using (2-7.19) and (2-7.17), Equation (2.1) can be integrated directly to get 
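the velocity v = v(t) at any time t; thus,


$$\dot{\mathbf{x}} = \mathbf{v} = \mathbf{g}t + \mathbf{v}_0. \tag{2.3}$$
A second integration gives
$$\mathbf{r} = \mathbf{x} - \mathbf{x}_0 = \tfrac{1}{2}\mathbf{g}t^2 + \mathbf{v}_0 t. \tag{2.4}$$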


This is a parametric equation for the displacement of the particle as a function 
of time. The trajectory is a segment of a parabola, as shown in Figure 2.1. The 
solution (2.4) presents the parabolic motion of the particle as the superposition
of two linear motions. The term $\mathbf{v}_0 t$ can be interpreted as the displacement
of a point at rest in a reference system moving with velocity $\mathbf{v}_0$, while the
term $\tfrac{1}{2}\mathbf{g}t^2$ is interpreted as the displacement of the particle initially at rest in
the moving system. Accordingly, the constants in (2.4) determine a “natural”
coordinate system for locating the particle at any time; the origin is determined
by $\mathbf{x}_0$, while coordinate directions and scales are given by the vectors $\mathbf{v}_0$
and g. This is a skew coordinate system, but the particle’s natural position
coordinates $t$ and $\tfrac{1}{2}gt^2$ are obviously quadratically related, just as they are in
the familiar representation by rectangular coordinates. Mathematically speaking,
the quantity $\tfrac{1}{2}\mathbf{g}t^2$ is a particular solution of the equation of motion (2.1),
while $\mathbf{v}_0 t + \mathbf{x}_0$ is the general solution of the homogeneous differential equation
$\ddot{\mathbf{x}} = 0$.
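
As a quick check on (2.3) and (2.4), the sketch below evaluates both formulas componentwise and confirms by a finite difference that the displacement has constant acceleration g. The numerical values and the function names `velocity` and `displacement` are illustrative assumptions, not part of the text.

```python
def velocity(g, v0, t):
    """Equation (2.3): v = g t + v0, evaluated componentwise."""
    return tuple(gi * t + vi for gi, vi in zip(g, v0))

def displacement(g, v0, t):
    """Equation (2.4): r = (1/2) g t^2 + v0 t, evaluated componentwise."""
    return tuple(0.5 * gi * t * t + vi * t for gi, vi in zip(g, v0))

g = (0.0, -9.8)   # assumed acceleration (horizontal, vertical)
v0 = (3.0, 4.0)   # assumed initial velocity
t, h = 1.0, 1e-3

r = displacement(g, v0, t)
v = velocity(g, v0, t)

# Central second difference of r(t); for a quadratic in t this recovers g
accel = tuple(
    (a - 2.0 * b + c) / h**2
    for a, b, c in zip(displacement(g, v0, t + h), r, displacement(g, v0, t - h))
)
```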

Fig. 2.1. The trajectory in position space.

The description of motion can be simplified by representing it in velocity
space instead of position space. A curve traced by the velocity v = v(t) is
called a hodograph. According to (2.3), the hodograph of a particle subject to
a constant force is a straight line. In velocity space the location of the particle
can be represented by the average velocity $\bar{\mathbf{v}} = \bar{\mathbf{v}}(t)$. The parametric equation
for $\bar{\mathbf{v}}$ is obtained by writing (2.4) in the form


$$\bar{\mathbf{v}} = \frac{\mathbf{r}}{t} = \tfrac{1}{2}\mathbf{g}t + \mathbf{v}_0. \tag{2.5}$$


Fig. 2.2. The trajectory in velocity space.


As illustrated in Figure 2.2, the trajectory $\bar{\mathbf{v}}(t)$ is simply a vertical straight line
beginning at $\mathbf{v}_0$. Figure 2.2 also shows the simple relation of the average velocity
(2.5) to the actual velocity (2.3), a relation which is disguised if the equation for
displacement (2.4) is used directly. Figure 2.2 contains all the information about
the projectile motion, so all questions about the motion can be answered by
solving the triangles in the figure, graphically or algebraically, for the relevant
variables. Let us see how to do this efficiently.


Projectile Range


Although the displacement r = | r | is represented only indirectly in velocity
space, it is nonetheless easy to compute. Consider, for example, the problem
of determining the range r of a target sighted in a direction $\hat{\mathbf{r}}$, which has been
hit by a projectile launched with velocity $\mathbf{v}_0$. This problem can easily be solved
by the graphical method illustrated in Figure 2.2, assuming, of course, that
the flight of the projectile is adequately represented as that of a particle with
constant acceleration g. The method simply exploits properties of Figure 2.2.
Having “laid out” $\mathbf{v}_0$ on graph paper, as indicated in Figure 2.3, one extends a
line from the base of $\mathbf{v}_0$ in the direction $\hat{\mathbf{r}}$ to its intersection with a vertical line
extending from the tip of $\mathbf{v}_0$. The lengths of the two sides of the triangle thus
constructed are then measured to get the values of $\tfrac{1}{2}gt$ and $r/t$, from which one
can compute r and t as well, if desired. Figure 2.2 also shows how the
construction can be extended to determine the final velocity v of the projectile.
Angles with the horizontal are indicated in Figure 2.2, since they are commonly
used to specify relative direction in practical problems.

Fig. 2.3. Graphical determination of the displacement r, time of flight t and the final velocity v.

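The same problem can be solved by algebraic means. Outer multiplication
of (2.5) by r produces


$$\tfrac{1}{2}t\,\mathbf{g}\wedge\mathbf{r} = \mathbf{r}\wedge\mathbf{v}_0.$$
Hence


$$t = \frac{2\,\mathbf{r}\wedge\mathbf{v}_0}{\mathbf{g}\wedge\mathbf{r}} = \frac{2v_0}{g}\,\frac{|\hat{\mathbf{r}}\wedge\hat{\mathbf{v}}_0|}{|\hat{\mathbf{g}}\wedge\hat{\mathbf{r}}|}. \tag{2.6}$$


This completely determines t from the target direction $\hat{\mathbf{r}}$ and the initial
velocity $\mathbf{v}_0$. We can get one other relation from (2.5) by “wedging” it with g,
namely


$$\mathbf{g}\wedge\mathbf{r} = t\,\mathbf{g}\wedge\mathbf{v}_0. \tag{2.7}$$
We can solve this for r and use (2.6) to eliminate t; thus
$$r = \frac{\mathbf{g}\wedge\mathbf{v}_0}{\mathbf{g}\wedge\hat{\mathbf{r}}}\,t = \frac{2(\mathbf{g}\wedge\mathbf{v}_0)(\hat{\mathbf{r}}\wedge\mathbf{v}_0)}{(\mathbf{g}\wedge\hat{\mathbf{r}})^2} = \frac{2v_0^2}{g}\,\frac{(\hat{\mathbf{g}}\wedge\hat{\mathbf{v}}_0)\cdot(\hat{\mathbf{v}}_0\wedge\hat{\mathbf{r}})}{|\hat{\mathbf{g}}\wedge\hat{\mathbf{r}}|^2}. \tag{2.8}$$


Thus, the range r has been expressed as an explicit function of the given
vectors $\mathbf{v}_0$, $\hat{\mathbf{r}}$ and g.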
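
The formulas (2.6) and (2.7) translate directly into a few lines of code. In the sketch below, which is an illustration under stated assumptions rather than anything prescribed by the text, the motion is confined to a plane, so every bivector is a multiple of the unit bivector and can be represented by a single signed coefficient; the helper name `wedge` and all numerical values are hypothetical. The computed time and range are checked against the trajectory (2.4).

```python
import math

def wedge(a, b):
    """Signed coefficient of the unit bivector in a^b for plane vectors."""
    return a[0] * b[1] - a[1] * b[0]

def flight_time_and_range(v0, r_hat, g):
    # (2.6): t = 2 (r_hat ^ v0) / (g ^ r_hat); a ratio of coplanar bivectors is a scalar
    t = 2.0 * wedge(r_hat, v0) / wedge(g, r_hat)
    # (2.7) solved for the range: r = t (g ^ v0) / (g ^ r_hat)
    r = t * wedge(g, v0) / wedge(g, r_hat)
    return t, r

g = (0.0, -9.8)                          # assumed acceleration
theta = math.radians(60.0)               # assumed firing angle
phi = math.radians(20.0)                 # assumed line-of-sight angle
v0 = (20.0 * math.cos(theta), 20.0 * math.sin(theta))
r_hat = (math.cos(phi), math.sin(phi))

t, r = flight_time_and_range(v0, r_hat, g)

# Position at time t from (2.4); it should lie at distance r along r_hat
pos = (0.5 * g[0] * t * t + v0[0] * t, 0.5 * g[1] * t * t + v0[1] * t)
```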
Maximum Range


The algebraic expression (2.8) for the range supplies more information than
the corresponding graphical construction. For example, for a fixed “muzzle
velocity” $v_0$ and target direction $\hat{\mathbf{r}}$, Equation (2.8) gives the range r as a
function of the firing direction $\hat{\mathbf{v}}_0$. The variation of r with changes in $\hat{\mathbf{v}}_0$ is
unclear from the form of (2.8), because an increase of $\hat{\mathbf{g}}\wedge\hat{\mathbf{v}}_0$ will be
accompanied by a decrease of $\hat{\mathbf{v}}_0\wedge\hat{\mathbf{r}}$, so their product might either increase or
decrease. The functional dependence is made more obvious by using the
identity

$$2(\hat{\mathbf{g}}\wedge\mathbf{v}_0)\cdot(\mathbf{v}_0\wedge\hat{\mathbf{r}}) = v_0^2\,[\hat{\mathbf{g}}\cdot\hat{\mathbf{r}} - \hat{\mathbf{g}}\cdot(\hat{\mathbf{v}}_0\hat{\mathbf{r}}\hat{\mathbf{v}}_0)], \tag{2.9}$$

established in Exercise (2-4.8d). The first term on the right side of (2.9) is
constant, while the second term is a function of the vector $\hat{\mathbf{v}}_0\hat{\mathbf{r}}\hat{\mathbf{v}}_0$ (the reader is
invited to construct a diagram showing the relation of this vector to $\hat{\mathbf{r}}$ and $\hat{\mathbf{v}}_0$).
The direction of $\hat{\mathbf{v}}_0$ which gives maximum range is obtained by maximizing the
value of (2.9). It is readily verified that $\hat{\mathbf{v}}_0\hat{\mathbf{r}}\hat{\mathbf{v}}_0$ is a unit vector, so (2.9) has its
maximum value when


$$-\hat{\mathbf{g}}\cdot(\hat{\mathbf{v}}_0\hat{\mathbf{r}}\hat{\mathbf{v}}_0) = 1.$$
From this we can conclude that
$$-\hat{\mathbf{g}} = \hat{\mathbf{v}}_0\hat{\mathbf{r}}\hat{\mathbf{v}}_0,$$
or equivalently,
$$-\hat{\mathbf{g}}\hat{\mathbf{v}}_0 = \hat{\mathbf{v}}_0\hat{\mathbf{r}}. \tag{2.10}$$


This equation tells us that the angle between $-\hat{\mathbf{g}}$ and $\hat{\mathbf{v}}_0$ must be equal to the
angle between $\hat{\mathbf{r}}$ and $\hat{\mathbf{v}}_0$. Thus, the vector $\hat{\mathbf{v}}_0$ bisects the angle between $\hat{\mathbf{r}}$ and $-\hat{\mathbf{g}}$
(Figure 2.4), so we can express $\hat{\mathbf{v}}_0$ as a function of $\hat{\mathbf{r}}$ and $\hat{\mathbf{g}}$ by

$$\hat{\mathbf{v}}_0 = \frac{\hat{\mathbf{r}} - \hat{\mathbf{g}}}{|\hat{\mathbf{r}} - \hat{\mathbf{g}}|}. \tag{2.11}$$

Fig. 2.4. For maximum range, shoot in a direction bisecting the vertical $-\hat{\mathbf{g}}$ and the line of sight.

Substituting (2.11) in (2.8), one immediately finds the following expression for
the maximum range:

$$r_{\max} = \frac{v_0^2}{g}\,\frac{1 + \hat{\mathbf{g}}\cdot\hat{\mathbf{r}}}{|\hat{\mathbf{g}}\wedge\hat{\mathbf{r}}|^2} = \frac{v_0^2}{g}\,\frac{1}{1 - \hat{\mathbf{g}}\cdot\hat{\mathbf{r}}}. \tag{2.12}$$

We saw in Section 2-6 that this is an equation for a paraboloid of revolution;
here, it expresses $r_{\max}$ as a function of the target direction $\hat{\mathbf{r}}$. As illustrated in
Figure 2.5, the paraboloid (2.12) is the envelope of all trajectories emanating
from the origin with the same initial speed $v_0$. Thus, only points under the
paraboloid can be reached by a projectile.

Fig. 2.5. The envelope of all trajectories with the same initial speed is a paraboloid of
revolution [Figure redrawn from W. D. MacMillan, Theoretical Mechanics, reprinted by Dover,
N.Y. (1959)].
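
The bisector rule (2.11) and the closed form (2.12) can be checked against a brute-force search over firing angles. The sketch below assumes planar motion with bivectors represented by signed coefficients; the helper names and the numerical values are illustrative choices, not part of the text.

```python
import math

def wedge(a, b):
    """Signed coefficient of the unit bivector in a^b for plane vectors."""
    return a[0] * b[1] - a[1] * b[0]

def range_along(v0, r_hat, g):
    # (2.8) in its middle form; the product of two coplanar bivectors is a scalar
    return 2.0 * wedge(g, v0) * wedge(r_hat, v0) / wedge(g, r_hat) ** 2

def best_direction(r_hat, g):
    # (2.11): unit vector bisecting r_hat and the upward vertical -g_hat
    gn = math.hypot(*g)
    d = (r_hat[0] - g[0] / gn, r_hat[1] - g[1] / gn)
    dn = math.hypot(*d)
    return (d[0] / dn, d[1] / dn)

def max_range(speed, r_hat, g):
    # (2.12): r_max = v0^2 / (g (1 - g_hat . r_hat))
    gn = math.hypot(*g)
    g_dot_r = (g[0] * r_hat[0] + g[1] * r_hat[1]) / gn
    return speed**2 / (gn * (1.0 - g_dot_r))

g, speed = (0.0, -9.8), 20.0             # assumed values
phi = math.radians(20.0)                 # assumed line-of-sight angle
r_hat = (math.cos(phi), math.sin(phi))

v0_hat = best_direction(r_hat, g)
r_best = range_along((speed * v0_hat[0], speed * v0_hat[1]), r_hat, g)
r_max = max_range(speed, r_hat, g)

# Brute-force scan of firing angles from 20 to 90 degrees in 0.01 degree steps
r_scan = max(
    range_along((speed * math.cos(a), speed * math.sin(a)), r_hat, g)
    for a in (math.radians(20.0 + 0.01 * k) for k in range(7001))
)
```

The scan never exceeds `r_max`, and the direction returned by `best_direction` attains it, which is the content of (2.11) and (2.12) for this choice of target direction.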

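The relation between t and r in (2.7) can be expressed differently by
eliminating $\mathbf{g}\wedge\mathbf{r}$ between (2.6) and (2.7) to get


$$\tfrac{1}{2}t^2 = \frac{\mathbf{r}\wedge\mathbf{v}_0}{\mathbf{g}\wedge\mathbf{v}_0}. \tag{2.13}$$


For maximum range, one sees from either (2.10) or (2.11) that $\hat{\mathbf{r}}\wedge\hat{\mathbf{v}}_0 = \hat{\mathbf{g}}\wedge\hat{\mathbf{v}}_0$,
in which case (2.12) and (2.13) give


$$t = \left(\frac{2r_{\max}}{g}\right)^{1/2} = \frac{v_0}{g}\left(\frac{2}{1 - \hat{\mathbf{g}}\cdot\hat{\mathbf{r}}}\right)^{1/2}. \tag{2.14}$$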


So far all results have been obtained by analyzing the upper triangle in 
Figure 2.2. The lower triangle contains additional information. The infor- 
mation in both triangles can be represented in a symmetrical form by extend- 
ing Figure 2.2 to a parallelogram, as shown in Figure 2.6a. The parallelogram 
can be characterized algebraically by the equations for its diagonals; 


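$$\mathbf{v} - \mathbf{v}_0 = \mathbf{g}t, \tag{2.15}$$
$$\mathbf{v} + \mathbf{v}_0 = \frac{2\mathbf{r}}{t}. \tag{2.16}$$


These are, of course, equivalent to the basic equations (2.3) and (2.4).
Multiplying equations (2.15) and (2.16) to eliminate t, one obtains
$$2\mathbf{g}\mathbf{r} = (\mathbf{v} - \mathbf{v}_0)(\mathbf{v} + \mathbf{v}_0) = v^2 - v_0^2 + 2\mathbf{v}\wedge\mathbf{v}_0. \tag{2.17}$$
Therefore,
$$2\mathbf{g}\cdot\mathbf{r} = v^2 - v_0^2, \tag{2.18}$$
$$\mathbf{g}\wedge\mathbf{r} = \mathbf{v}\wedge\mathbf{v}_0. \tag{2.19}$$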


Equation (2.18) will be recognized as expressing conservation of energy. It 
determines the final speed v = | v | of the particle in terms of initial data. On 
the other hand, Equation (2.19) determines the relative direction of initial 
and final velocities. 
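
To make the energy interpretation of (2.18) explicit, one can multiply it by m/2 and use r = x − x_0. The short derivation below is a sketch; the identification of $-m\mathbf{g}\cdot\mathbf{x}$ as the potential energy is an assumption introduced here for illustration, not something established earlier in this section.

```latex
\tfrac{1}{2}mv^2 - \tfrac{1}{2}mv_0^2 = m\,\mathbf{g}\cdot\mathbf{r} = m\,\mathbf{g}\cdot(\mathbf{x} - \mathbf{x}_0)
\quad\Longrightarrow\quad
\tfrac{1}{2}mv^2 - m\,\mathbf{g}\cdot\mathbf{x} = \tfrac{1}{2}mv_0^2 - m\,\mathbf{g}\cdot\mathbf{x}_0 .
```

The combination $\tfrac{1}{2}mv^2 - m\mathbf{g}\cdot\mathbf{x}$ therefore has the same value at every point of the trajectory.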

A glance at Figure 2.5 suggests that there are two distinct trajectories with
initial speed $v_0$ passing through any given point r within the maximum range.
The trajectories can be ascertained from equations of the last paragraph. Let
$\mathbf{v}_0$ and $\mathbf{v}_0'$ be their initial velocities. By assumption the initial speeds are the
same;


$$|\mathbf{v}_0| = |\mathbf{v}_0'| = v_0. \tag{2.20a}$$
By (2.18), the final speeds are the same;
$$|\mathbf{v}| = |\mathbf{v}'| = v. \tag{2.20b}$$


And by (2.19), the area of the parallelogram determined by $\mathbf{v}'$ and $\mathbf{v}_0'$ is equal
to the area of the parallelogram determined by $\mathbf{v}$ and $\mathbf{v}_0$;
$$\mathbf{v}\wedge\mathbf{v}_0 = \mathbf{v}'\wedge\mathbf{v}_0'. \tag{2.20c}$$


Hence the two parallelograms are similar. They are illustrated in Figures 2.6a
and 2.6b. Since corresponding diagonals have equal length, it follows at once
that


$$t\,t' = \frac{2r}{g}. \tag{2.21}$$

Fig. 2.6a. Fig. 2.6b.


This relation can be used to determine one of the two trajectories from the 
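other. The problem is to find the direction $\hat{\mathbf{v}}_0'$ from the directions $\hat{\mathbf{r}}$ and $\hat{\mathbf{v}}_0$.
This can be done by using (2.7), together with (2.6), to get the relation
$$t'\,\mathbf{g}\wedge\mathbf{v}_0' = \mathbf{g}\wedge\mathbf{r} = 2t^{-1}\,\mathbf{r}\wedge\mathbf{v}_0.$$


Eliminating $|\mathbf{v}_0'|$ and t′ with (2.20a) and (2.21), we get


$$\hat{\mathbf{g}}\wedge\hat{\mathbf{v}}_0' = \hat{\mathbf{r}}\wedge\hat{\mathbf{v}}_0. \tag{2.22}$$


This says that the angle $\hat{\mathbf{v}}_0'$ makes with the vertical is equal to the angle
between $\hat{\mathbf{v}}_0$ and $\hat{\mathbf{r}}$. Equation (2.22) generalizes the relation (2.10) which we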
found for trajectories of maximum range. 
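
The pairing of trajectories can be explored numerically. The sketch below rests on an assumption that goes slightly beyond the text: in the plane, the condition (2.22) is satisfied by reflecting $\hat{\mathbf{v}}_0$ in the bisector of $\hat{\mathbf{r}}$ and $-\hat{\mathbf{g}}$, that is, in the maximum-range direction (2.11). The code constructs the second direction this way (the helper `other_direction` is hypothetical) and then checks (2.22), the equality of the two ranges, and the product rule (2.21).

```python
import math

def wedge(a, b):
    """Signed coefficient of the unit bivector in a^b for plane vectors."""
    return a[0] * b[1] - a[1] * b[0]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def unit(a):
    n = math.hypot(*a)
    return (a[0] / n, a[1] / n)

def other_direction(v0_hat, r_hat, g_hat):
    """Reflect v0_hat in the bisector n of r_hat and -g_hat: v' = 2 (n.v) n - v."""
    n = unit((r_hat[0] - g_hat[0], r_hat[1] - g_hat[1]))
    c = 2.0 * dot(n, v0_hat)
    return (c * n[0] - v0_hat[0], c * n[1] - v0_hat[1])

def flight_time(v0, r_hat, g):
    # (2.6)
    return 2.0 * wedge(r_hat, v0) / wedge(g, r_hat)

g, g_hat, speed = (0.0, -9.8), (0.0, -1.0), 20.0   # assumed values
phi, theta = math.radians(20.0), math.radians(70.0)
r_hat = (math.cos(phi), math.sin(phi))
v0_hat = (math.cos(theta), math.sin(theta))
v0p_hat = other_direction(v0_hat, r_hat, g_hat)

v0 = (speed * v0_hat[0], speed * v0_hat[1])
v0p = (speed * v0p_hat[0], speed * v0p_hat[1])
t, tp = flight_time(v0, r_hat, g), flight_time(v0p, r_hat, g)

# Ranges from (2.7) for the two firing directions
r = t * wedge(g, v0) / wedge(g, r_hat)
rp = tp * wedge(g, v0p) / wedge(g, r_hat)
```

For the assumed angles the second firing direction comes out at 40 degrees above the horizontal, and both shots reach the same point on the line of sight.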

For the case of maximum range, Equation (2.21) must agree with (2.14); 
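hence t = t′, and the two parallelograms in Figure 2.6 reduce to a single
rectangle. Then $\mathbf{v}\cdot\mathbf{v}_0 = 0$, and (2.19) can be solved for the final velocity:


$$\mathbf{v} = (\mathbf{g}\wedge\mathbf{r})\mathbf{v}_0^{-1} = \frac{(\mathbf{g}\wedge\mathbf{r})\cdot\mathbf{v}_0}{v_0^2}. \tag{2.23}$$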


All the significant properties of a trajectory with maximum range have now 
been determined. 


Rectangular and Polar Coordinates 


For some purposes it may be convenient to express the above results in terms 
of rectangular or polar coordinates. The positive vertical direction is represented
by the unit vector $-\hat{\mathbf{g}} = -\mathbf{g}/g$. The vertical coordinate y of the particle
is then defined by


$$y = -\hat{\mathbf{g}}\cdot\mathbf{r}, \tag{2.24}$$
while its horizontal coordinate x is defined by

$$\hat{\mathbf{g}}\wedge\mathbf{r} = \mathbf{i}\,|\hat{\mathbf{g}}\wedge\mathbf{r}| = \mathbf{i}x, \tag{2.25a}$$
or

$$x = |\hat{\mathbf{g}}\wedge\mathbf{r}| \ge 0. \tag{2.25b}$$
These relations are more useful when combined into the single set of equations

$$\hat{\mathbf{g}}\mathbf{r} = -y + \mathbf{i}x = \mathbf{i}(x + \mathbf{i}y) = \mathbf{i}re^{\mathbf{i}\phi} = re^{\mathbf{i}(\phi + \pi/2)}. \tag{2.26}$$
Similarly, for the initial velocity $\mathbf{v}_0$ one writes

$$\hat{\mathbf{g}}\mathbf{v}_0 = -v_{0y} + \mathbf{i}v_{0x} = \mathbf{i}(v_{0x} + \mathbf{i}v_{0y}) = \mathbf{i}v_0e^{\mathbf{i}\theta}. \tag{2.27}$$


As illustrated in Figure 2.7 the angles φ and θ measure inclinations of $\hat{\mathbf{r}}$ and $\hat{\mathbf{v}}_0$
from the horizontal direction specified by the vector $\hat{\mathbf{g}}\mathbf{i}$.
Equations (2.26) and (2.27) compactly describe how rectangular and polar
coordinates are related to the vectors g, $\mathbf{v}_0$ and r. From them one can read off,
for example, the relations

$$\hat{\mathbf{g}}\wedge\mathbf{r} = \mathbf{i}r\cos\phi,$$

$$\hat{\mathbf{g}}\wedge\mathbf{v}_0 = \mathbf{i}v_0\cos\theta.$$


Moreover,


$$\mathbf{r}\mathbf{v}_0 = (\mathbf{r}\hat{\mathbf{g}})(\hat{\mathbf{g}}\mathbf{v}_0) = rv_0e^{\mathbf{i}(\theta - \phi)}. \tag{2.28}$$

Fig. 2.7.

This makes it easy to put the range equation (2.8) in the more conventional form


$$r = \frac{2v_0^2}{g}\,\frac{\cos\theta\,\sin(\theta - \phi)}{\cos^2\phi}. \tag{2.29}$$


Similarly, (2.7) can be put in the form


$$\cos\theta = \frac{r\cos\phi}{v_0t}. \tag{2.30}$$

This formula can be used to find the firing angle θ when the location of the
target is given. The time of flight t can be evaluated by using the result of
Exercise (2.1) below. In a similar way, other multivector equations in this
section can be put in conventional trigonometric form. But it should be
evident by now that the multivector equations are usually easier to manipulate
than their trigonometric counterparts.


3-2. Exercises 


(2.1) Derive the following expression for time of flight as a function of
target location:


$$t = \frac{\sqrt{2}}{g}\left\{v_0^2 + \mathbf{g}\cdot\mathbf{r} \pm \left[(v_0^2 + \mathbf{g}\cdot\mathbf{r})^2 - g^2r^2\right]^{1/2}\right\}^{1/2}.$$


(2.2) From Equation (2.4) one can get a quadratic equation for t,
$$t^2 + 2\mathbf{v}_0\cdot\mathbf{g}^{-1}t - 2\mathbf{r}\cdot\mathbf{g}^{-1} = 0.$$


Discuss the significance of the roots and how they are related to the
result of Exercise (2.1).

(2.3) The vertex of a parabolic trajectory is defined by the equation
$\mathbf{v}\cdot\mathbf{g} = 0$. Show that the time of flight to the vertex is given by
$t = -\mathbf{v}_0\cdot\mathbf{g}^{-1}$. Use this to determine the location of the vertex.

(2.4) Use equations (2.18) and (2.19) to determine the maximum horizontal
range x for a projectile with initial speed $v_0$ fired at targets on
a plateau with (vertical) elevation y above the firing pad.

(2.5) Find the minimum initial speed $v_0$ needed for a projectile to reach a
target with horizontal range x and elevation y. Determine also the
firing angle θ, the time of flight t and the final velocity v of the
projectile. Specifically, show that


$$v = [g(r - y)]^{1/2}, \qquad v_0 = [g(r + y)]^{1/2}, \qquad \tan\theta = \frac{y + r}{x}.$$


(2.6) From Equations (2.15) and (2.16) obtain
$$\mathbf{v}\wedge\mathbf{g} = \mathbf{v}_0\wedge\mathbf{g},$$
$$\mathbf{v}\wedge\mathbf{r} = \mathbf{r}\wedge\mathbf{v}_0.$$


Solve these equations to get


$$\mathbf{v} = \left(\frac{\mathbf{v}_0\wedge\mathbf{r}}{\mathbf{r}\wedge\mathbf{g}}\right)\mathbf{g} + \left(\frac{\mathbf{v}_0\wedge\mathbf{g}}{\mathbf{r}\wedge\mathbf{g}}\right)\mathbf{r}.$$

(2.7) Determine the area swept out in time t by the displacement vector
of a particle with constant acceleration g and initial velocity $\mathbf{v}_0$.


3-3. Constant Force with Linear Drag 


We have seen that the trajectory of a particle subject only to a constant force 
mg is a parabola. We now consider deviations from parabolic motion due to a 
linear resistive force, that is, a resistive force directly proportional to the 
velocity. Expressing the resistive force in the form $-m\gamma\mathbf{v}$, where γ is a positive
constant, the equation of motion can be written


$$\dot{\mathbf{v}} = \mathbf{g} - \gamma\mathbf{v}. \tag{3.1}$$


This differential equation is most easily solved by noticing that $e^{\gamma t}$ is an
integrating factor*. Thus,


$$e^{\gamma t}(\dot{\mathbf{v}} + \gamma\mathbf{v}) = \frac{d}{dt}(\mathbf{v}e^{\gamma t}) = \mathbf{g}e^{\gamma t}.$$


This integrates directly to


$$\mathbf{v}e^{\gamma t} - \mathbf{v}_0 = \mathbf{g}\int_0^t e^{\gamma s}\,ds = \mathbf{g}\,\frac{e^{\gamma t} - 1}{\gamma}.$$


Solving for v = v(t), we have


$$\mathbf{v} = \mathbf{g}\,\frac{1 - e^{-\gamma t}}{\gamma} + \mathbf{v}_0e^{-\gamma t}. \tag{3.2}$$

*For a general method of determining integrating factors without guessing, see Exercise (2.ae

The constant $\gamma^{-1}$ is called a relaxation time; it provides a measure of the time it
takes for the retarding force to make the particle “forget” its initial conditions.
If $t \gg \gamma^{-1}$, then $e^{-\gamma t} \ll 1$, so no matter what the value of $\mathbf{v}_0$, the first
term on the right side of (3.2) eventually dominates all others; then we have


$$\mathbf{v}_\infty = \gamma^{-1}\mathbf{g}. \tag{3.3}$$
This value of the velocity is called the terminal velocity. As the particle
approaches the terminal velocity its acceleration becomes negligible. Indeed,
(3.1) gives the terminal velocity directly when the term $\dot{\mathbf{v}}$ is regarded as
negligible.
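
The solution (3.2) and the terminal velocity (3.3) are easy to check numerically. The sketch below uses assumed values of g, $\mathbf{v}_0$ and γ and a hypothetical helper `velocity`; it verifies by a central difference that (3.2) satisfies the equation of motion (3.1), and that the velocity approaches $\gamma^{-1}\mathbf{g}$ after many relaxation times.

```python
import math

def velocity(g, v0, gamma, t):
    """Equation (3.2): v = g (1 - exp(-gamma t)) / gamma + v0 exp(-gamma t)."""
    decay = math.exp(-gamma * t)
    return tuple(gi * (1.0 - decay) / gamma + vi * decay for gi, vi in zip(g, v0))

g = (0.0, -9.8)     # assumed acceleration
v0 = (15.0, 10.0)   # assumed initial velocity
gamma = 0.5         # assumed drag constant, so the relaxation time is 2.0

# Check (3.1) at t = 1: dv/dt should equal g - gamma v
t, h = 1.0, 1e-5
v = velocity(g, v0, gamma, t)
dv_dt = tuple(
    (a - b) / (2.0 * h)
    for a, b in zip(velocity(g, v0, gamma, t + h), velocity(g, v0, gamma, t - h))
)
rhs = tuple(gi - gamma * vi for gi, vi in zip(g, v))

# After 50 relaxation times the initial velocity is "forgotten"
v_late = velocity(g, v0, gamma, 50.0 / gamma)
v_terminal = tuple(gi / gamma for gi in g)   # (3.3)
```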

The displacement $\mathbf{r} = \mathbf{x} - \mathbf{x}_0$ of the particle from its initial position is found
by integrating (3.2) directly. The result is


$$\mathbf{r} = \mathbf{g}\,\frac{e^{-\gamma t} + \gamma t - 1}{\gamma^2} + \mathbf{v}_0\,\frac{1 - e^{-\gamma t}}{\gamma}. \tag{3.4}$$


From the two Equations (3.4) and (3.2), properties of the trajectory can be
determined by algebraic means. With the initial conditions $\mathbf{x}_0$, $\mathbf{v}_0$ specified,
the general shape of the trajectory can be determined by locating the vertical
maximum and asymptote. The location of a vertical maximum is determined
by the condition $\mathbf{g}\cdot\mathbf{v} = 0$ (Exercise (3.1)). To locate the asymptote, we
deduce from (3.4) that the horizontal displacement of the particle is


$$x(t) = |\hat{\mathbf{g}}\wedge\mathbf{r}| = \gamma^{-1}|\hat{\mathbf{g}}\wedge\mathbf{v}_0|\,(1 - e^{-\gamma t}),$$
with a maximum


$$x(\infty) = \gamma^{-1}|\hat{\mathbf{g}}\wedge\mathbf{v}_0|. \tag{3.5}$$


Figure 3.1 compares trajectories of 
particles with the same initial velocities 
subject to resistive forces of different 
strengths. Asymptotes are not shown 
because they do not fit on the figure. 
(See Exercise (3.2) for analysis.) 


Fig. 3.1. Comparison of trajectories with differing resistance. The parameter T here is the
time of flight to vertical maximum for the parabolic trajectory.


Time of Flight


The time of flight to a target specified by $\mathbf{v}_0$ and $\hat{\mathbf{r}}$ can be determined by the
same general method used in the parabolic case. The range | r | is eliminated
from (3.4) to produce an equation for t. To this end, take the outer product of
(3.4) with $\gamma^2\hat{\mathbf{r}}$ to get

$$\hat{\mathbf{r}}\wedge\mathbf{g}\,(e^{-\gamma t} + \gamma t - 1) + \gamma\,\hat{\mathbf{r}}\wedge\mathbf{v}_0\,(1 - e^{-\gamma t}) = 0.$$
Divide this by $\hat{\mathbf{r}}\wedge\mathbf{g}$ and introduce the notation


$$T = \frac{2\,\hat{\mathbf{r}}\wedge\mathbf{v}_0}{\mathbf{g}\wedge\hat{\mathbf{r}}} \tag{3.6}$$
to get
$$\gamma t = (1 + \tfrac{1}{2}\gamma T)(1 - e^{-\gamma t}). \tag{3.7}$$


Comparison with (2.6) shows that the scalar T defined by (3.6) is precisely the
time of flight in the parabolic case. It should be noted that T is completely
determined by the target direction $\hat{\mathbf{r}}$ and the initial velocity $\mathbf{v}_0$.

Equation (3.7) is a transcendental equation for t in terms of T. For $t \ll \gamma^{-1}$,
we can get an approximate solution of the equation by expanding the
exponential; thus


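$$\gamma t = (1 + \tfrac{1}{2}\gamma T)\left(\gamma t - \tfrac{1}{2}\gamma^2t^2 + \tfrac{1}{6}\gamma^3t^3 - \ldots\right).$$
Dividing by $\tfrac{1}{2}\gamma^2t\,(1 + \tfrac{1}{2}\gamma T)$ and rearranging terms, we get

$$t = \frac{T}{1 + \tfrac{1}{2}\gamma T} + \tfrac{1}{3}\gamma t^2 + \ldots \tag{3.8}$$


As it should, this equation reduces to t = T when γ = 0. The first order
approximation of this result is obtained by replacing t by T on the right of
(3.8) and using the binomial expansion of the first term to first order; thus,


$$t = T(1 - \tfrac{1}{2}\gamma T) + \tfrac{1}{3}\gamma T^2,$$


or,
$$t = T(1 - \tfrac{1}{6}\gamma T). \tag{3.9}$$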


The time of flight computed here is less than in the parabolic case, because
the range is less, though the target direction is the same. 
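
Since (3.7) is transcendental, its positive root has to be found numerically if more than first-order accuracy is wanted. The sketch below is one possible approach under stated assumptions: it uses plain bisection on the interval [T/2, T], where the residual of (3.7) changes sign for positive γT, and compares the root with the first-order estimate (3.9). The function name `flight_time_with_drag` and the sample values of T and γ are illustrative.

```python
import math

def flight_time_with_drag(T, gamma, tol=1e-12):
    """Positive root of gamma t = (1 + gamma T / 2)(1 - exp(-gamma t)), by bisection."""
    a = 1.0 + 0.5 * gamma * T

    def residual(t):
        return gamma * t - a * (1.0 - math.exp(-gamma * t))

    lo, hi = 0.5 * T, T          # residual(lo) < 0 < residual(hi) when gamma T > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

T, gamma = 2.0, 0.05             # assumed parabolic flight time and drag constant
t_root = flight_time_with_drag(T, gamma)
t_first_order = T * (1.0 - gamma * T / 6.0)   # (3.9)

# Residual of (3.7) at the computed root
check = gamma * t_root - (1.0 + 0.5 * gamma * T) * (1.0 - math.exp(-gamma * t_root))
```

For these values γT = 0.1, and the two results differ only at second order in γT, as expected from the way (3.9) was derived.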


Range 


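We can estimate the range by expressing it as a function of t, just as we did in
the parabolic case. Taking the outer product of (3.4) with g and solving for
r = | r |, we get


$$r = \frac{\mathbf{g}\wedge\mathbf{v}_0}{\mathbf{g}\wedge\hat{\mathbf{r}}}\,\frac{1 - e^{-\gamma t}}{\gamma}.$$
Using (3.9) to estimate the time-dependence to first order in γ, we find
$$\frac{1 - e^{-\gamma t}}{\gamma} = t - \tfrac{1}{2}\gamma t^2 = T(1 - \tfrac{1}{6}\gamma T) - \tfrac{1}{2}\gamma T^2 = T(1 - \tfrac{2}{3}\gamma T).$$
So


$$r = \frac{\mathbf{g}\wedge\mathbf{v}_0}{\mathbf{g}\wedge\hat{\mathbf{r}}}\,T(1 - \tfrac{2}{3}\gamma T).$$

Comparison with (2.8) shows that

$$R = \frac{\mathbf{g}\wedge\mathbf{v}_0}{\mathbf{g}\wedge\hat{\mathbf{r}}}\,T = \frac{2(\mathbf{g}\wedge\mathbf{v}_0)\cdot(\mathbf{v}_0\wedge\hat{\mathbf{r}})}{|\mathbf{g}\wedge\hat{\mathbf{r}}|^2} \tag{3.10a}$$


is identical to our previous expression for range in the parabolic case.
Consequently, it is convenient to write our range formula in the form


$$r = R(1 - \tfrac{2}{3}\gamma T) = R\left(1 - \frac{4\gamma}{3}\left|\frac{\hat{\mathbf{r}}\wedge\mathbf{v}_0}{\mathbf{g}\wedge\hat{\mathbf{r}}}\right|\right), \tag{3.10b}$$


showing the first-order correction to the range in the parabolic case.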


Ohm’s Law 


The equations we have been discussing are useful in the analysis of micro- 
scopic as well as macroscopic motions. For example, consider an electron 
(with mass m and charge e) moving in a conductor under the influence of a 
constant electric field E. The electron’s motion will be retarded by collisions 
with atoms in the conductor. We may attempt to represent the retardation by 
a resistive force proportional to the velocity. If the resistance is independent 
of the direction in which the electron moves, we say that the conductor is an 
isotropic medium, and we can write the resistive force in the form —yv, where 
uw is a constant. We are thus led to consider the equation of motion 


mv = eK — pv. Gay 


For times large compared with the relaxation time 

$$\tau = \frac{m}{\mu}, \qquad (3.12)$$

the electron will reach the terminal velocity 

$$\mathbf{v}_\infty = \frac{e\mathbf{E}}{\mu}, \qquad (3.13)$$


and there will be a steady electric current in the conductor. The electric 
current density J is given by 


J = Nev, (3.14) 


where N is the density of electrons. Substituting (3.13) into (3.14), we get 
Ohm’s Law 

$$\mathbf{J} = \sigma\mathbf{E}, \qquad (3.15)$$

where the conductor’s d-c conductivity σ is given by 

$$\sigma = \frac{e^2 N}{\mu}. \qquad (3.16)$$


Ohm’s law holds remarkably well for many conductors over a wide range of 
currents. The conductivity σ and the electron density N can be measured, so μ 
can be calculated from (3.16). Then the relaxation time τ can be calculated 
from (3.12) and compared with measured values. The results are in general 
agreement with the extremely short relaxation times found for metals. Thus, 
our selection of Equation (3.11) is vindicated to some degree, and we have 
come to understand Ohm’s law as something more than a mere empirical 
relation. But we can hardly claim to have a satisfactory explanation or 
derivation of Ohm’s law, because our understanding of (3.11) and its domain 
of validity is too rudimentary at this point. For one thing, the velocity v can 
certainly not be seriously regarded as the velocity of an individual electron. It 
must be interpreted as some kind of average electron velocity. The trajectory 
of an individual electron must be very irregular as it collides repeatedly with 
the much more massive atoms in the conductor. Our equations describe only 
average motion in the microscopic domain. Derivation and explanation of 
these equations requires statistical mechanics and equations governing the 
submicroscopic motion of electrons, specifically, the basic equations of quantum 
mechanics. This much is certain: Ohm’s law is not a fundamental law of 
physics; it is a macroscopic approximation to complex processes taking place 
at the atomic level. 
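
The estimate described above, μ from (3.16) and then τ from (3.12), can be sketched in a few lines. The numbers below are assumptions made for illustration only: approximate room-temperature values for copper in SI units, which are not taken from the text.

```python
# Physical constants (SI units).
E_CHARGE = 1.602e-19   # electron charge, C
E_MASS = 9.109e-31     # electron mass, kg

# Assumed, approximate values for copper at room temperature.
sigma = 5.96e7         # d-c conductivity, S/m
N = 8.5e28             # conduction-electron density, m^-3

mu = E_CHARGE**2 * N / sigma   # resistive coefficient, from (3.16)
tau = E_MASS / mu              # relaxation time, from (3.12)
print("mu  =", mu, "kg/s")
print("tau =", tau, "s")
```

With these inputs the relaxation time comes out on the order of 10⁻¹⁴ s, which illustrates what the text means by "extremely short".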


3-3. Exercises 


(3.1) Show that if g·v₀ < 0, then a particle subject to Equation (3.1) 
reaches a maximum height in time 

$$t_m = \gamma^{-1}\log\left(1 - \mathbf{v}_0\cdot\mathbf{v}_\infty^{-1}\right),$$

where v∞ = γ⁻¹g is the terminal velocity. Show that the displacement 
to maximum r_m is given by 

$$\gamma\,\mathbf{r}_m = \mathbf{v}_\infty\log\left(1 - \mathbf{v}_0\cdot\mathbf{v}_\infty^{-1}\right) + \frac{(\mathbf{v}_0\cdot\mathbf{v}_\infty)(\mathbf{v}_\infty - \mathbf{v}_0)}{v_\infty^2 - \mathbf{v}_0\cdot\mathbf{v}_\infty}.$$

Show also that 

$$\frac{x_m}{x_\infty} = \left(1 - \frac{v_\infty^2}{\mathbf{v}_0\cdot\mathbf{v}_\infty}\right)^{-1},$$

where x_m is the horizontal coordinate of the maximum, and x∞ is 
the distance from the initial position to the vertical asymptote. Note 
that this relation implies that the greater |v₀| or γ, the more blunted 
the trajectory. 

(3.2) A natural unit of time for a parabolic trajectory is the time of flight 
T = −v₀·g⁻¹ to its vertical maximum. The ratio of T to the relaxation 
time γ⁻¹ for resisted motion is a dimensionless parameter 
which completely determines the relative shapes of trajectories for 
resisted and non-resisted motion. A convenient parameter determining 
the size of the trajectories is the horizontal distance X to the 
vertical maximum in the parabolic case. To make the comparison 
quantitative, derive the exact relations 

$$\frac{x_m}{X} = \frac{1}{1 + \gamma T}, \qquad \frac{x_\infty}{X} = \frac{1}{\gamma T},$$

where x_m and x∞ are as defined in the preceding exercise. Let x₀ 
denote the coordinate of the point where the trajectory crosses the 
horizontal. For t₀ ≪ γ⁻¹, derive the approximate relation 

$$x_0 \approx 2X\left(1 - \tfrac{4}{3}\gamma T\right).$$

For the accurate curves in Figure 3.1, check the given values for γT, 
locate the asymptotes and compare horizontal crossing points with 
computed values of x₀. 

(3.3) An equation with general form 

$$\dot{\mathbf{v}} + \gamma\mathbf{v} = \mathbf{g},$$

where g = g(t) and scalar γ = γ(t) are specified functions of t, is said 
to be a linear first order differential equation with scalar coefficient. 
A function λ = λ(t) is an integrating factor of this equation if, when 
multiplied by λ, the equation can be put in the form 

$$\frac{d}{dt}(\lambda\mathbf{v}) = \lambda\mathbf{g}.$$

Show that the integrating factor is determined by 

$$\dot\lambda = \gamma\lambda, \qquad \text{that is,} \qquad \lambda(t) = \lambda_0\exp\left(\int_0^t \gamma(s)\,ds\right),$$

and the solution is given by 

$$\mathbf{v} = \lambda^{-1}(t)\int_0^t \lambda(s)\,\mathbf{g}(s)\,ds + \lambda^{-1}(t)\,\lambda_0\mathbf{v}_0.$$




3-4. Constant Force with Quadratic Drag 


As a rule, resistance of the atmosphere to motion of a projectile is more 
accurately described by a quadratic function of the velocity than by the linear 
function considered in the last section. In this case, the resistive force has the 
form $-m\alpha v\mathbf{v}$, where $v = |\mathbf{v}|$ and α is a positive constant. For a particle 
subject to a constant force and quadratic drag, the equation of motion can be 
written 


$$\dot{\mathbf{v}} = \mathbf{g} - \alpha v\mathbf{v}. \qquad (4.1)$$


To solve this equation, we must resort to approximation methods. By such 
methods we can solve any differential equation to the degree of accuracy 
required by a given problem. However, one approximation method may be 
easier to apply than another, or it may yield results in a more useful form. 

Before getting involved in details of a calculation, we should find out what 
we can about general features of the motion. Our analysis of motion with 
linear drag provides a valuable qualitative guide to the quadratic case. 
Indeed, if the coefficient αv in (4.1) is replaced by some estimate γ of its 
average value on the trajectory, the exact solution with linear drag provides a 
good quantitative approximation to the motion in limited time intervals. 
About the motion over unlimited time intervals, we can draw the following 
general conclusions: 

(1) The particle eventually “forgets” its initial velocity and reaches a 
terminal velocity 

$$\mathbf{v}_\infty = \hat{\mathbf{g}}\left(\frac{g}{\alpha}\right)^{1/2}. \qquad (4.2)$$

This is obtained from (4.1) by neglecting $\dot{\mathbf{v}}$. 

(2) As in Figure 3.1, trajectories with different initial velocities have 
different shapes. The greater the initial speed, the more blunted the trajectory 
and the less symmetrical its shape. 

(3) There will be a maximum horizontal displacement which, by (3.5), 
cannot exceed $(\alpha v_\infty)^{-1}\,|\hat{\mathbf{g}}\wedge\mathbf{v}_0|$. 

(4) The maximum horizontal range for given initial speed v₀ occurs for a 
firing angle θ < 45°, and θ decreases as v₀ increases. 


Horizontal and Vertical Components 


The constant vector g in (4.1) determines a preferred direction which will 
naturally be reflected in the solution. For this reason, it is of some interest to 
decompose the motion into horizontal and vertical components. We write 


$$\hat{\mathbf{g}}\mathbf{v} = \hat{\mathbf{g}}\cdot\mathbf{v} + \hat{\mathbf{g}}\wedge\mathbf{v} = v_\parallel + i v_\perp, \qquad (4.3)$$

where $v_\parallel = \hat{\mathbf{g}}\cdot\mathbf{v}$ is the vertical component of velocity and $v_\perp$ is the horizontal 
component. The orientation of the unit bivector i in (4.3) is fixed by choosing 
$v_\perp$ positive at some time. The horizontal and vertical coordinates of position 
are similarly defined by 

$$\hat{\mathbf{g}}\mathbf{x} = x_\parallel + i x_\perp. \qquad (4.4)$$


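Equation (4.1) can be decomposed into components by separating scalar and 
bivector parts after multiplying by $\hat{\mathbf{g}}$. The scalar part gives the equation 

$$\frac{d}{dt}(\hat{\mathbf{g}}\cdot\mathbf{v}) = \hat{\mathbf{g}}\cdot\dot{\mathbf{v}} = g - \alpha v\,\hat{\mathbf{g}}\cdot\mathbf{v},$$

or 

$$\dot v_\parallel = g - \alpha v v_\parallel. \qquad (4.5a)$$

The bivector part yields 

$$\hat{\mathbf{g}}\wedge\dot{\mathbf{v}} = \frac{d}{dt}(\hat{\mathbf{g}}\wedge\mathbf{v}) = -\alpha v\,\hat{\mathbf{g}}\wedge\mathbf{v},$$

or 

$$\frac{di}{dt}\,v_\perp + i\,\dot v_\perp = -\alpha v\, i\, v_\perp.$$

Since $i^2 = -1$ implies that $\langle i\,di/dt\rangle_0 = 0$, we conclude that 

$$\dot v_\perp = -\alpha v v_\perp, \qquad (4.5b)$$

and 

$$\frac{di}{dt} = 0. \qquad (4.5c)$$

In view of (4.3) and (4.4), we see that (4.5c) means that the entire trajectory 
lies in a vertical plane. 

The differential equations (4.5a, b) are not as simple as they look, because 
they are coupled by the condition $v^2 = v_\parallel^2 + v_\perp^2$. However, along a fairly 
horizontal trajectory, the condition $v_\perp \gg v_\parallel$ is satisfied, and the equations 
can be approximated by 

$$\dot v_\perp = -\alpha v_\perp^2, \qquad (4.6a)$$

$$\dot v_\parallel = g - \alpha v_\perp v_\parallel. \qquad (4.6b)$$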


These equations can be solved exactly (Exercise 4.2). 


Perturbation Theory 


We turn now to a different method for getting approximate solutions to the 
equation of motion. To estimate the deviation from a parabolic trajectory due 
to drag, we write the displacement vector in the form 


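$$\mathbf{r}(t) = \tfrac{1}{2}\mathbf{g}t^2 + \mathbf{v}_0 t + \mathbf{s}(t), \qquad (4.7a)$$

and we impose the initial condition 

$$\mathbf{s}(0) = 0, \qquad (4.7b)$$

so s = s(t) describes the deviation as a function of time. Differentiating, we 
have 

$$\mathbf{v} = \dot{\mathbf{r}} = \mathbf{g}t + \mathbf{v}_0 + \dot{\mathbf{s}}, \qquad (4.8a)$$

subject to the initial condition 

$$\dot{\mathbf{s}}(0) = 0. \qquad (4.8b)$$

Substituting (4.8a) into (4.1), we get an exact equation for the deviation 

$$\ddot{\mathbf{s}} = -\alpha\,|\mathbf{g}t + \mathbf{v}_0 + \dot{\mathbf{s}}|\,(\mathbf{g}t + \mathbf{v}_0 + \dot{\mathbf{s}}). \qquad (4.9)$$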


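Over a time interval in which $|\mathbf{g}t + \mathbf{v}_0| \gg |\dot{\mathbf{s}}|$, Equation (4.9) is well 
approximated by 

$$\ddot{\mathbf{s}} = -\alpha\,|\mathbf{g}t + \mathbf{v}_0|\,(\mathbf{g}t + \mathbf{v}_0). \qquad (4.10)$$

This equation can be integrated directly and exactly, but the result is unwieldy, 
so we will be satisfied with solutions which meet the condition 

$$|\mathbf{g}t + \mathbf{v}_0| = v_0\left(1 + 2\,\mathbf{v}_0^{-1}\cdot\mathbf{g}\,t + v_0^{-2}g^2t^2\right)^{1/2} \approx v_0\left[1 + \mathbf{v}_0^{-1}\cdot\mathbf{g}\,t\right]. \qquad (4.11)$$

In fact, this relation is exact if g∧v₀ = 0, and it is an excellent approximation 
as long as gt ≪ v₀. Assuming (4.11), we integrate (4.10) to get 

$$\dot{\mathbf{s}} = -\alpha v_0 t\left[\mathbf{v}_0\left(1 + \tfrac{1}{2}\mathbf{v}_0^{-1}\cdot\mathbf{g}\,t\right) + \mathbf{g}t\left(\tfrac{1}{2} + \tfrac{1}{3}\mathbf{v}_0^{-1}\cdot\mathbf{g}\,t\right)\right]. \qquad (4.12)$$

Integrating once more, we get 

$$\mathbf{s} = -\alpha v_0 t^2\left[\mathbf{v}_0\left(\tfrac{1}{2} + \tfrac{1}{6}\mathbf{v}_0^{-1}\cdot\mathbf{g}\,t\right) + \mathbf{g}t\left(\tfrac{1}{6} + \tfrac{1}{12}\mathbf{v}_0^{-1}\cdot\mathbf{g}\,t\right)\right]. \qquad (4.13)$$

Substituting this result in (4.7a), we get 

$$\mathbf{r} = \tfrac{1}{2}\mathbf{g}t^2\left(1 - \alpha v_0 t\left[\tfrac{1}{3} + \tfrac{1}{6}\mathbf{v}_0^{-1}\cdot\mathbf{g}\,t\right]\right) + \mathbf{v}_0 t\left(1 - \alpha v_0 t\left[\tfrac{1}{2} + \tfrac{1}{6}\mathbf{v}_0^{-1}\cdot\mathbf{g}\,t\right]\right). \qquad (4.14)$$

A graph of the trajectory can be constructed from a parabola by evaluating s 
at a few points, as indicated in Figure 4.1. It should be noted that it is the 
second term in solid brackets on the right side of (4.11) that distinguishes 
quadratic drag from linear drag, for, if it is neglected, our approximation 
amounts to assuming a linear drag force $-\alpha v_0\mathbf{v} = -\alpha v_0(\mathbf{g}t + \mathbf{v}_0)$. 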
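
The first-order result can be evaluated directly. The following is a minimal sketch, not part of the original text, of a function that returns the perturbed position from the two bracketed correction factors obtained by integrating (4.10) twice under the approximation (4.11). The function name, the two-dimensional tuple representation (x horizontal, y upward) and the sample numbers are illustrative assumptions.

```python
import math

def perturbed_position(t, v0, g, alpha):
    """First-order perturbative position under quadratic drag.

    v0 and g are 2D tuples; alpha is the quadratic drag constant.
    """
    speed0 = math.hypot(v0[0], v0[1])
    # Scalar v0^{-1}.g = (v0.g)/|v0|^2, using the vector inverse v0/|v0|^2.
    k = (v0[0] * g[0] + v0[1] * g[1]) / speed0**2
    factor_g = 1.0 - alpha * speed0 * t * (1.0 / 3.0 + k * t / 6.0)
    factor_v = 1.0 - alpha * speed0 * t * (1.0 / 2.0 + k * t / 6.0)
    return tuple(0.5 * gi * t * t * factor_g + vi * t * factor_v
                 for gi, vi in zip(g, v0))

# Illustrative values (assumed), SI units.
v0, g = (30.0, 30.0), (0.0, -9.8)
parabola = perturbed_position(1.0, v0, g, 0.0)    # alpha = 0: pure parabola
dragged = perturbed_position(1.0, v0, g, 0.001)   # small quadratic drag
print("no drag:  ", parabola)
print("with drag:", dragged)
```

Setting alpha to zero recovers the parabolic trajectory, which is a quick sanity check on the correction factors. The estimate should only be trusted while gt remains small compared with the initial speed, as the text requires.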


Our calculation of (4.14) illustrates a general method of approximation 
called perturbation theory. The idea is to estimate the deviation (i.e. perturbation) 
from a known (i.e. unperturbed) trajectory caused by a (usually small) 
perturbing force. We have estimated the perturbation s(t) caused by quadratic 
drag. Our result (4.14) is a first order perturbative approximation to 
the exact trajectory. We can get a (more accurate) second order approximation 
by regarding (4.14) as the unperturbed trajectory and calculating first 
order deviations from it. A more efficient way to get the same result is to 
substitute the expression (4.12) for $\dot{\mathbf{s}}$ into the exact Equation (4.9) to get an 
explicit expression for the time dependence of $\ddot{\mathbf{s}}$ which can be integrated 
directly. 

Fig. 4.1. Drag deviation from a parabolic trajectory. The deviation 
vector s relates “simultaneous” positions on the two trajectories. 


Polygonal Approximation 


Another general method commonly used in numerical computation of trajectories 
is the method of polygonal approximation. The idea is that any curve 
can be approximated with arbitrary accuracy by a sequence of joined line 
segments (sides of a polygon if the curve is in a plane). We proceed as follows: 
Choose a small interval of time τ and consider a succession of times $t_k = k\tau$, 
where k = 0, 1, 2, ... . The velocities at successive times are determined by 
the equation 

$$\mathbf{v}_{k+1} = \tau\mathbf{a}_k + \mathbf{v}_k, \qquad (4.15)$$

where $\mathbf{v}_k = \mathbf{v}(t_k)$ and $\mathbf{a}_k = \mathbf{a}(t_k) = \dot{\mathbf{v}}(t_k)$. This equation is obtained by regarding 
the acceleration as constant in the small time interval τ. The acceleration 
is determined by the equation of motion (4.1), which gives 

$$\mathbf{a}_k = \mathbf{g} - \alpha v_k\mathbf{v}_k. \qquad (4.16)$$

Because this equation happens to be independent of $\mathbf{x}_k = \mathbf{x}(t_k)$, we can use it 
at once to find each $\mathbf{v}_k$ from the initial velocity $\mathbf{v}_0$ by iterating (4.15). To find 
points on the trajectory, we use the mean value theorem from differential 
calculus, which implies that the chord of a small segment on a smooth curve is 
parallel to the tangent at the segment’s midpoint, or, in our case, 

$$\mathbf{v}_{k+1} = \frac{\mathbf{x}_{k+2} - \mathbf{x}_k}{2\tau}.$$

We use this in the form 

$$\mathbf{x}_{k+2} = 2\tau\mathbf{v}_{k+1} + \mathbf{x}_k, \qquad (4.17)$$

which, by iteration, enables us to find $\mathbf{x}_k$ for even k from $\mathbf{x}_0$ and the $\mathbf{v}_k$. 
Drawing a smooth curve through these points, we get the desired approximate 
trajectory, as illustrated in Figure 4.2. 
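
The iteration just described translates almost line for line into code. The following is a minimal sketch, not part of the original text, that applies (4.15), (4.16) and (4.17) in two dimensions; the function name, the tuple representation and the sample values are illustrative assumptions.

```python
import math

def polygonal_trajectory(x0, v0, g, alpha, tau, steps):
    """Return velocities v_k (k = 0..steps) and positions x_k for even k."""
    v = [tuple(v0)]
    for k in range(steps):
        speed = math.hypot(v[k][0], v[k][1])
        # Acceleration from the equation of motion, as in (4.16).
        a_k = tuple(gi - alpha * speed * vi for gi, vi in zip(g, v[k]))
        # Velocity update with constant acceleration over tau, as in (4.15).
        v.append(tuple(tau * ai + vi for ai, vi in zip(a_k, v[k])))
    x = {0: tuple(x0)}
    for k in range(0, steps, 2):
        # Chord over two intervals uses the midpoint velocity, as in (4.17).
        x[k + 2] = tuple(2.0 * tau * vi + xi for vi, xi in zip(v[k + 1], x[k]))
    return v, x

# Illustrative run (assumed values, SI units).
v, x = polygonal_trajectory((0.0, 0.0), (30.0, 30.0), (0.0, -9.8),
                            alpha=0.001, tau=0.05, steps=40)
for k in sorted(x)[::8]:
    print(k, x[k])
```

A useful property of the midpoint form (4.17) is that, with alpha set to zero, the computed points lie exactly on the parabola, so any departure from the parabola in a run with drag is due to the drag term alone.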


Fig. 4.2. Polygonal approximation. 


Clearly, the perturbation method is superior to the polygonal method for 
the present problem. It requires only one iteration to achieve a useful result, 
whereas the polygonal method requires many iterations, because it must 
reproduce the curvature of the parabola as well as the effect of the pertur- 
bation. Also, the results of the perturbation method are easier to use and 
interpret, because they are in analytical instead of graphical or tabular form, 
and they provide explicit relations between perturbed and unperturbed 
solutions. 


3-4. Exercises 


(4.1) Integrate the equation $\dot{\mathbf{v}} = -\alpha v\mathbf{v}$ to get the exact solution 

$$\mathbf{v} = \frac{\mathbf{v}_0}{1 + \alpha v_0 t}.$$

How long will it take for the drag to reduce the velocity to half the 
initial velocity? How far will the particle travel in this time interval? 

(4.2) Solve Equation (4.6a, b). (It is helpful to notice that $\lambda = 1 + v_{0\perp}\alpha t$ is 
an integrating factor.) Show that the horizontal displacement $r_\perp$ and 
the vertical displacement $r_\parallel$ are given by 

$$r_\perp = \frac{1}{\alpha}\log(1 + v_{0\perp}\alpha t),$$

$$r_\parallel = \frac{gt^2}{4} + \frac{gt}{2v_{0\perp}\alpha} + \frac{1}{v_{0\perp}\alpha}\left(v_{0\parallel} - \frac{g}{2v_{0\perp}\alpha}\right)\log(1 + v_{0\perp}\alpha t).$$

Compare this result with Equation (4.14), and point out any advantages 
of one result over the other. 

(4.3) Check Figure 4.1 by evaluating the vertical “shortfall” $\mathbf{s}\cdot\hat{\mathbf{g}}$ at distinguished 
points on the figure. Give a qualitative argument for 
believing that there is a time at which two particles would cross the 
same horizontal plane if they could be launched along the two 
trajectories simultaneously. Why can’t you determine this time from 
the approximate expression (4.13) for s? 

(4.4) Show that if v₀·g < 0, then air resistance will have the effect of 
hastening the return of a projectile to the horizontal plane from 
which it is launched by a time interval 

$$\Delta t = \tfrac{2}{3}\,\alpha v_0\left(\mathbf{v}_0\cdot\mathbf{g}^{-1}\right)^2.$$

(Hint: First determine the time of flight without air resistance). 

(4.5) Use the perturbation method to derive a general estimate for the 
time of flight to a specified target. 

(4.6) Use the perturbation method to find the maximum height of a 
projectile trajectory. Compare with the exact result from Exercise 
4.7. 

(4.7) Equation (4.1) can be integrated by separation of variables when 
g∧v₀ = 0. First show that for an initial downward velocity the 
equation can be put in the form 

$$\frac{dv}{dt} = g\left(1 - \frac{v^2}{v_\infty^2}\right).$$

Integrate this to get 

$$v = v_\infty\tanh\left(\frac{gt}{v_\infty} + c\right) = v_\infty\,\frac{v_0 + v_\infty\tanh(gt/v_\infty)}{v_\infty + v_0\tanh(gt/v_\infty)},$$

where $c = \tanh^{-1}(v_0/v_\infty)$. 

Integrate again to get the displacement 

$$r = \frac{v_\infty^2}{g}\log\left[\frac{\cosh(gt/v_\infty + c)}{\cosh c}\right].$$

To express the displacement as a function of velocity, substitute 

$$\frac{dv}{dt} = v\frac{dv}{dr} = \frac{1}{2}\frac{dv^2}{dr}$$

into the equation of motion and integrate to get 

$$r = \frac{v_\infty^2}{2g}\log\left[\frac{v_\infty^2 - v_0^2}{v_\infty^2 - v^2}\right].$$

Repeat the calculation for an initial upward velocity. 
Determine the maximum height of the trajectory. 


3-5. Fluid Resistance 


In the two preceding sections, we integrated the equations of motion for a 
particle subject to resistive forces linear and quadratic in the velocity. The 
solutions are of little practical value unless we know the physical circum- 
stances in which they apply and have some estimate of the numerical factors 
involved. This section is devoted to such ‘physical considerations’, but we 
deal with only one among many kinds of resistive forces. 

An object moving through a fluid such as water or air is subject to a force 
exerted by the fluid. This force can be resolved into a resistive force, sometimes 
called drag, directly opposing the motion and a component called lift, 
orthogonal to the velocity of the object. The lift vanishes for objects which 
are sufficiently small or symmetrical, so there are many problems in which the 
drag component alone is significant. 

The analysis of drag, not to mention lift, is a complex problem in fluid 
dynamics which is even today under intensive study. Our aim in this section is 
only to summarize some general results pertinent to particle mechanics. In 
particular, we are interested in specific expressions for the drag along with a 
rough idea of their physical basis and range of applicability. 

Since the drag D always opposes motion through a fluid, it can be written in 
the general form 

$$\mathbf{D} = -D\hat{\mathbf{V}}, \qquad (5.1)$$

where V is the ambient velocity. The ambient velocity of an object is its 
velocity relative to the undisturbed fluid. If v is the velocity of the object and 
u is the velocity of the fluid relative to some fixed object or frame, then the 
ambient velocity is 

$$\mathbf{V} = \mathbf{v} - \mathbf{u}. \qquad (5.2)$$
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For example, v might be the velocity of a projectile relative to the earth while 
u is the velocity of the wind. 

The magnitude D = |D| of the drag is commonly written in the standard 
form 

$$D = \tfrac{1}{2}C_D\,\rho A V^2, \qquad (5.3)$$

where ρ is the mass density of the fluid, A is the cross-sectional area of the 
object across the line of motion, V = |V| is the ambient speed, and the drag 
coefficient $C_D$ is a dimensionless quantity measuring the relative strength of 
the drag. The value of $C_D$ depends on the size, shape and speed of the object 
in relation to properties of the fluid. It is advantageous to express the ambient 
speed as a dimensionless variable $\mathcal{R}$ called the Reynolds number, because the 
functional dependence of $C_D$ on $\mathcal{R}$ is the same for all fluids, notably, for the 
two most common fluids on earth, water and air. 

For a sphere with diameter 2a the Reynolds number is defined by the 
expression 

$$\mathcal{R} = \frac{2aV\rho}{\eta}, \qquad (5.4)$$

where η is the viscosity and, as before, ρ is the mass density of the fluid. The 
viscosity, like the mass density, is determined empirically; it would take us 
too far afield to explain how. For small $\mathcal{R}$, the drag coefficient can easily be 
determined from hydrodynamic theory, with the result 

$$C_D = \frac{24}{\mathcal{R}}. \qquad (5.5)$$

When this is substituted in (5.3) and (5.4) is used, the drag assumes the form 

$$\mathbf{D} = -6\pi\eta a\mathbf{V}. \qquad (5.6)$$

This famous result is known as Stokes’ Law in honor of the man who first 
derived it. Figure 5.1 shows that Stokes’ Law agrees well with experiment for 
$\mathcal{R} \lesssim 1$. Another condition for its validity is that the sphere’s radius a be large 
compared with the mean free path of molecules in the fluid, which for air 
under standard conditions is of order 10⁻⁵ cm. The mean free path is the 
average distance a molecule travels between collisions. 

Fig. 5.1. Drag coefficient for a sphere at small Reynolds 
number [Redrawn from Batchelor (1967)]. 

Note that Stokes’ Law (5.6) is the linear drag law assumed in Section 3-3, 
not the quadratic drag assumed in Section 3-4. From the data in Table 5.1 we 
can estimate the range of size and speed for which Stokes’ Law applies. For a 
sphere moving through air, the condition $\mathcal{R} \lesssim 1$ implies 

$$2aV \lesssim \left(\frac{\eta}{\rho}\right)_{\rm air} = 0.15\ {\rm cm^2\,sec^{-1}}. \qquad (5.7)$$

For motion through water this number is reduced by a factor of 15. Thus, 
Stokes’ Law applies only to quite small objects at low velocities, such as one 
encounters in the sedimentation of silt in streams or pollutants in the atmosphere. 
This gives us some idea of the domain in which the linear resistive 
force studied in Section 3-3 can be expected to lead to quantitatively accurate 
results. 


TABLE 5.1. Density and viscosity of water and air at a temperature of 20 °C and a 
pressure of 1 atmosphere. 

For water, 
    ρ = 1.00 g cm⁻³ = 10³ kg m⁻³, 
    η = 1.00 × 10⁻² dyn sec cm⁻². 

For air, 
    ρ = 1.20 × 10⁻³ g cm⁻³ = 1.20 kg m⁻³, 
    η = 1.81 × 10⁻⁴ dyn sec cm⁻². 

Specific viscosity: 
    (η/ρ) for air = 0.15 cm² sec⁻¹ = 15 × (η/ρ) for water. 

To see how velocities compare in magnitude to Reynolds numbers for 
projectiles, consider a sphere of radius a = 1 cm moving through air. From 
Table 5.1, we find 

    for V = 1 m sec⁻¹,    $\mathcal{R}$ = 1.3 × 10³; 
    for V = 330 m sec⁻¹,  $\mathcal{R}$ = 4.4 × 10⁵.    (5.8) 

According to Figure 5.2, for velocities in this range the drag coefficient is 
nearly constant with the value 

$$C_D \approx \tfrac{1}{2}. \qquad (5.9)$$

The upper limit of 330 m sec⁻¹ is the speed of sound in air. It follows that for 
spherical projectiles with speeds less than sound, the drag is fairly well 
represented by 

$$D = (0.94\ {\rm kg\,m^{-3}})\,a^2V^2. \qquad (5.10)$$


Figure 5.2 shows that there is a pronounced decrease in the value of the drag 
coefficient in the vicinity of the speed of sound. A well-struck golf ball takes 
advantage of this. The dimples on a golf ball also reduce drag by disrupting 
the boundary layer of air that tends to form on the ball’s surface. Besides this, 
the trajectory of a golf ball will be affected by its spin which is usually large 
enough to produce a significant lift. 
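
The numerical estimates for a sphere in air quoted in this section can be reproduced from the Table 5.1 data. The following is a minimal sketch, not part of the original text; it works in SI units (so the viscosity of air, 1.81 × 10⁻⁴ dyn sec cm⁻², becomes 1.81 × 10⁻⁵ kg m⁻¹ sec⁻¹), and the function names are illustrative assumptions.

```python
import math

RHO_AIR = 1.20       # mass density of air, kg m^-3 (Table 5.1)
ETA_AIR = 1.81e-5    # viscosity of air, kg m^-1 s^-1 (Table 5.1, converted to SI)

def reynolds_number(radius, speed, rho=RHO_AIR, eta=ETA_AIR):
    """Reynolds number of a sphere: 2 a V rho / eta."""
    return 2.0 * radius * speed * rho / eta

def drag_magnitude(radius, speed, c_d, rho=RHO_AIR):
    """Standard-form drag (1/2) C_D rho A V^2 with A = pi a^2."""
    return 0.5 * c_d * rho * math.pi * radius**2 * speed**2

def stokes_drag(radius, speed, eta=ETA_AIR):
    """Magnitude of the Stokes drag 6 pi eta a V."""
    return 6.0 * math.pi * eta * radius * speed

a = 0.01  # sphere radius of 1 cm, in metres
print("R at 1 m/s:  ", reynolds_number(a, 1.0))
print("R at 330 m/s:", reynolds_number(a, 330.0))
# Coefficient multiplying a^2 V^2 when C_D = 1/2:
print("drag coefficient factor:", drag_magnitude(1.0, 1.0, 0.5))
```

As a consistency check, inserting the small-Reynolds-number drag coefficient 24/𝓡 into the standard form reproduces the Stokes expression exactly, which is how the linear law arises from the general formula.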


Fig. 5.2. Experimental values for the drag coefficient of a sphere for 
10 < $\mathcal{R}$ < 10⁷ [Data from Batchelor (1967)]. 


For very high velocities the Mach number is more significant than the 
Reynolds number. The Mach number is the ratio of the ambient speed to the 
speed of sound. Figure 5.3 shows that, for velocities well above the speed of 
sound, we have 


Ca". (Sei) 
so in this regime the drag (5.3) is again a quadratic function of the ambient 
velocity. 


A crude qualitative understanding of fluid drag can be achieved by regard- 
ing it as a composite of two effects, viscous drag and pressure drag. Pure 
viscous drag is described by Stokes’ Law (5.6). It is due to fluid friction as the 
fluid flows smoothly over the surface of the object, so, as (5.6) shows, viscous 
drag is proportional to the circumference of the sphere. 

As the ambient velocity of the sphere increases, the pressure differential 
between front and back increases as well, giving rise to pressure drag. Pure 
pressure drag is characterized by (5.3) when $C_D$ is constant. The form of the 
equation can be interpreted by noting that the rate of collision with molecules 
in the fluid is proportional to AV, while the average momentum transfer per 
collision introduces another factor of V. Deviations from pressure drag arise 
from viscous effects in the fluid that piles up in front of the moving object. 

Fig. 5.3. Ballistic range measurements of drag on spheres and cone-cylinders 
as a function of Mach number (Charters and Thomas (1945), 
Hodges (1957), Stevens (1950)). [From R. N. Cox and L. F. Crabtree, 
Elements of Hypersonic Aerodynamics, Academic Press, N. Y. (1965), p. 
42, with permission]. 

The data introduced above apply only to objects which are large compared 
to the mean free path of molecules in the fluid. In Chapter 8 we will be 
concerned with the effect of atmospheric drag on artificial satellites. The 
dimensions of such a satellite are small compared with the mean free path of 
molecules in the outer atmosphere (which at 300 km above the surface of the 
earth is about 10 km). In this case fluid mechanics does not apply, and the 
collisions of individual molecules with the satellite can be regarded as independent 
of one another. The result is pure pressure drag with 


$$C_D \approx 2. \qquad (5.12)$$


For nonspherical artificial satellites the average of the lift force over changing 
orientations will tend to be zero, and an average drag can be employed, but 
the result will be subject to statistical uncertainties that render a refined 
dynamical analysis futile. The average drag will differ from that given by (5.3) 
only in the interpretation of A as the average cross-sectional area. The 
cross-sectional area of any convex body averaged over all orientations can be 
shown to be equal to one-fourth its surface area. A “convex” body is one 
whose surface intersects a straight line no more than twice. 


3-5. Exercises 


(5.1) Find the terminal velocity of a man (m = 73 kg) falling through the 
atmosphere for two extreme cases: 
(a) He spread-eagles, producing a cross-sectional area of 0.6 m² in 
the direction of motion. 
(b) He tucks to produce a cross-sectional area of only 0.3 m². 
What terminal velocity do you get for the first case using Stokes’ 
Law? 

(5.2) What parachute diameter is required to reduce the terminal velocity 
of a parachutist (m = 73 kg) to 5 m sec⁻¹? 

(5.3) What is the terminal velocity of a raindrop with (typical) radius of 
10⁻³ m? What result would you get from Stokes’ Law? 

(5.4) A mortar has a maximum range of 2000 m at sea level. What would 
be the maximum range in the absence of air resistance? 

(5.5) Two iron balls of weights 1 kg and 100 kg are dropped simultaneously 
and fall through a distance 100 m at sea level. Which ball 
hits the ground first and how far behind does the other lag? What is 
the difference between arrival times of the two balls? 


3-6. Uniform Magnetic Field 


The classical equation of motion for a particle with charge q in a magnetic 
field B is 

$$m\dot{\mathbf{v}} = \frac{q}{c}\,\mathbf{v}\times\mathbf{B}. \qquad (6.1)$$

Let us lump the constants together by writing 

$$\boldsymbol{\omega} = -\frac{q}{mc}\,\mathbf{B}, \qquad (6.2)$$

so Equation (6.1) takes the form 

$$\dot{\mathbf{v}} = \boldsymbol{\omega}\times\mathbf{v}. \qquad (6.3)$$

It is convenient to introduce the bivector Ω dual to ω by writing 

$$\Omega = i\boldsymbol{\omega}. \qquad (6.4)$$

Then, since 

$$\boldsymbol{\omega}\times\mathbf{v} = -i\,\boldsymbol{\omega}\wedge\mathbf{v} = -(i\boldsymbol{\omega})\cdot\mathbf{v} = \mathbf{v}\cdot\Omega,$$

Equation (6.3) can be written in the alternative form 

$$\dot{\mathbf{v}} = \mathbf{v}\cdot\Omega. \qquad (6.5)$$

Without solving this equation, we can conclude from it that |v| is a constant 
of the motion by an argument already made in Section 2-7. Since the 
magnitude of v is constant, the solution of (6.5) is a rotating velocity vector. 
The solution will show that Ω can be regarded as the angular velocity of this 
rotation. 

Equation (6.5) can be solved by generalizing the method of integrating 
factors used in Section 3-3. This is made evident by writing (6.5) in the form 

$$\dot{\mathbf{v}} + \tfrac{1}{2}\Omega\mathbf{v} + \mathbf{v}\left(-\tfrac{1}{2}\Omega\right) = 0. \qquad (6.6)$$

Suppose we can find a function R = R(t) with the property that 

$$\dot R = \tfrac{1}{2}R\,\Omega. \qquad (6.7a)$$

Since Ω† = −Ω the reverse of (6.7a) is the equation 

$$\dot R^\dagger = -\tfrac{1}{2}\Omega R^\dagger. \qquad (6.7b)$$

Now multiplying (6.6) on the left by R and on the right by R† and using (6.7a, 
b) we put it in the form 

$$\frac{d}{dt}\left(R\mathbf{v}R^\dagger\right) = 0. \qquad (6.8)$$

Thus we see that R and R† are integrating factors for (6.6). Two integrating 
factors instead of one are needed, because Ω does not commute with v. We 
say that R is a left integrating factor while R† is a right integrating factor for the 
equation. 

Our method of integrating factors has replaced the problem of solving the 
equation for v by the problem of solving the differential equation (6.7a) for 
the integrating factor. This is a significant simplification when Ω is constant, 
for then (6.7a) will be recognized as the derivative of the exponential function 

$$R = e^{\frac{1}{2}\Omega t}. \qquad (6.9)$$

It follows that 

$$R^\dagger = e^{\frac{1}{2}\Omega^\dagger t} = e^{-\frac{1}{2}\Omega t}.$$

Moreover, 

$$R(0) = R^\dagger(0) = 1, \qquad (6.10)$$

and 

$$R^\dagger R = RR^\dagger = 1. \qquad (6.11)$$

With the initial conditions (6.10) for the integrating factors, Equation (6.8) 
integrates immediately to 

$$R\mathbf{v}R^\dagger - \mathbf{v}_0 = 0,$$

and, using (6.11) to solve for v, we get 

$$\mathbf{v} = R^\dagger\mathbf{v}_0R = e^{-\frac{1}{2}\Omega t}\,\mathbf{v}_0\,e^{\frac{1}{2}\Omega t}. \qquad (6.12)$$

By squaring both sides of this equation, it is readily checked that the solution 
has the property v² = v₀², so the magnitude of v is constant, as we had 
anticipated. 

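We can combine the two exponential factors in (6.12) into a single term by 
decomposing v₀ into components that commute or anticommute with Ω. Let 
$\mathbf{v}_{0\parallel}$ be the component of v₀ parallel to the magnetic field while $\mathbf{v}_{0\perp}$ is the 
component perpendicular to the magnetic field, as indicated in Figure 6.1. 
Algebraically, the decomposition of v at any time t is expressed by the equations 

$$\mathbf{v} = \mathbf{v}_\parallel + \mathbf{v}_\perp,$$

where 

$$\boldsymbol{\omega}\mathbf{v}_\parallel = \boldsymbol{\omega}\cdot\mathbf{v} = \mathbf{v}_\parallel\boldsymbol{\omega}, \qquad (6.13a)$$

$$\boldsymbol{\omega}\mathbf{v}_\perp = \boldsymbol{\omega}\wedge\mathbf{v} = -\mathbf{v}_\perp\boldsymbol{\omega}, \qquad (6.13b)$$

or, alternatively 

$$\Omega\mathbf{v}_\parallel = \Omega\wedge\mathbf{v} = \mathbf{v}_\parallel\Omega, \qquad (6.14a)$$

$$\Omega\mathbf{v}_\perp = \Omega\cdot\mathbf{v} = -\mathbf{v}_\perp\Omega. \qquad (6.14b)$$

Fig. 6.1. 

Using (6.14a, b) and the series definition of the exponential function, it is 
readily established that 

$$R^\dagger\mathbf{v}_{0\perp} = \mathbf{v}_{0\perp}e^{\frac{1}{2}\Omega t} = \mathbf{v}_{0\perp}R, \qquad (6.15a)$$

$$R^\dagger\mathbf{v}_{0\parallel} = \mathbf{v}_{0\parallel}e^{-\frac{1}{2}\Omega t} = \mathbf{v}_{0\parallel}R^\dagger. \qquad (6.15b)$$

Now, with (6.15a, b) and (6.11), we can put (6.12) in the form 

$$\mathbf{v} = \mathbf{v}_{0\perp}R^2 + \mathbf{v}_{0\parallel} = \mathbf{v}_{0\perp}e^{\Omega t} + \mathbf{v}_{0\parallel}. \qquad (6.16)$$

Here we see that $\mathbf{v}_\parallel = \mathbf{v}_{0\parallel}$ is fixed while $\mathbf{v}_\perp = \mathbf{v}_{0\perp}e^{\Omega t}$ rotates through an angle 
Ωt in time t; so the resultant velocity vector v sweeps out a portion of a cone, 
as indicated in Figure 6.1. 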
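
The rotating-velocity solution can be checked numerically without any geometric algebra library. The following is a minimal sketch, not part of the original text: it assumes the conventional vector-algebra translation of the rotating perpendicular component, v⊥ cos(|ω|t) + (ω̂ × v⊥) sin(|ω|t), and verifies by a centered finite difference that the result satisfies v̇ = ω × v with constant speed. The helper names and sample values are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def velocity(t, v0, omega):
    """Velocity at time t: fixed parallel part plus rotating perpendicular part."""
    w = math.sqrt(dot(omega, omega))
    w_hat = tuple(c / w for c in omega)
    v_par = tuple(dot(v0, w_hat) * c for c in w_hat)
    v_perp = tuple(p - q for p, q in zip(v0, v_par))
    turned = cross(w_hat, v_perp)   # v_perp rotated a quarter turn about omega
    cos_wt, sin_wt = math.cos(w * t), math.sin(w * t)
    return tuple(p + cos_wt * q + sin_wt * r
                 for p, q, r in zip(v_par, v_perp, turned))

# Illustrative values (assumed).
v0, omega = (1.0, 2.0, 0.5), (0.0, 0.0, 3.0)
t, h = 0.7, 1e-6
v = velocity(t, v0, omega)
dv = tuple((p - q) / (2 * h)
           for p, q in zip(velocity(t + h, v0, omega), velocity(t - h, v0, omega)))
residual = max(abs(p - q) for p, q in zip(dv, cross(omega, v)))
print("max |dv/dt - omega x v| =", residual)
print("speed now and initially:", math.sqrt(dot(v, v)), math.sqrt(dot(v0, v0)))
```

The residual should be limited only by the finite-difference step, and the two printed speeds should agree to rounding error, in line with the conservation of |v| noted in the text.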

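We find the particle trajectory by substituting $\mathbf{v} = \dot{\mathbf{x}}$ into (6.16) and integrating 
directly, with the result 

$$\mathbf{x} - \mathbf{x}_0 = \mathbf{v}_{0\perp}\Omega^{-1}\left(e^{\Omega t} - 1\right) + \mathbf{v}_{0\parallel}t.$$

The form of the solution can be simplified by an appropriate choice of origin. 
Introducing the variable 

$$\mathbf{r} = \mathbf{x} - \mathbf{x}_0 + \mathbf{v}_{0\perp}\Omega^{-1} = \mathbf{x} - \mathbf{x}_0 + \mathbf{v}_0\times\boldsymbol{\omega}^{-1}, \qquad (6.17)$$

the solution can be cast in the equivalent forms 

$$\mathbf{r} = \mathbf{v}_{0\perp}\Omega^{-1}e^{\Omega t} + \mathbf{v}_{0\parallel}t = (\mathbf{v}_0\times\boldsymbol{\omega}^{-1})\,e^{\Omega t} + (\mathbf{v}_0\cdot\boldsymbol{\omega}^{-1})\,\boldsymbol{\omega}t. \qquad (6.18)$$

This is a parametric equation for a helix with radius 

$$\mathbf{a} = \mathbf{v}_0\times\boldsymbol{\omega}^{-1}, \qquad (6.19a)$$

and pitch 

$$b = \mathbf{v}_0\cdot\boldsymbol{\omega}^{-1}. \qquad (6.19b)$$

Adopting the angle of rotation 

$$\theta = |\boldsymbol{\omega}|\,t \qquad (6.19c)$$

as parameter, and writing $\mathbf{b} = b\hat{\boldsymbol{\omega}}$, Equation (6.18) can be put in the “standard 
form” for a helix 

$$\mathbf{r}(\theta) = \mathbf{a}\,e^{i\hat{\boldsymbol{\omega}}\theta} + \mathbf{b}\theta \qquad (6.20)$$

with it understood that a·b = 0. The helix is said to be right-handed if b > 0 
and left-handed if b < 0. (See Figure 6.2a, b). 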


Fig. 6.2a. Right-handed helix. Fig. 6.2b. Left-handed helix. 


The trajectory (6.18) reduces to a circle when $\mathbf{v}_{0\parallel} = 0$. The radius vector r 
rotates with an angular speed |ω| = |qB|/mc called the cyclotron frequency. 
According to (6.2), ω has the same direction as the magnetic field B 
when the charge q is negative and the opposite direction when the charge is 
positive. As shown in Figure 6.3, the circular motion of a negative charge is 
right-handed relative to B while that of a positive charge is left-handed. 

Fig. 6.3. 


The Effect of Linear Drag 


Now let us see how the motion just considered is modified by a linear resistive 
force. We seek solutions of the equation 

$$\dot{\mathbf{v}} + \Omega\cdot\mathbf{v} + \gamma\mathbf{v} = 0, \qquad (6.21)$$

where γ is a positive constant. With all quantities defined as before, by 
introducing appropriate integrating factors, we put (6.21) in the form 


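$$\frac{d}{dt}\left(e^{\gamma t}R\mathbf{v}R^\dagger\right) = 0.$$

Integrating and solving for v = v(t), we get 

$$\mathbf{v} = e^{-\gamma t}R^\dagger\mathbf{v}_0R = e^{-\gamma t}\left(\mathbf{v}_{0\perp}e^{\Omega t} + \mathbf{v}_{0\parallel}\right). \qquad (6.22)$$

Writing $\mathbf{v} = \dot{\mathbf{x}}$ and integrating once more, we find that the equation for the 
trajectory has the same form as (6.22), specifically, 

$$\mathbf{r} = e^{-\gamma t}\left(\mathbf{a}\,e^{\Omega t} + \mathbf{b}\right), \qquad (6.23)$$

where 

$$\mathbf{a} = \mathbf{v}_{0\perp}(\Omega - \gamma)^{-1},$$

$$\mathbf{b} = -\mathbf{v}_{0\parallel}\gamma^{-1},$$

$$\mathbf{r} = \mathbf{x} - \mathbf{x}_0 + \mathbf{a} + \mathbf{b}.$$

Equation (6.23) describes a particle spiraling to rest at r = 0. The displacement 
in time t ≫ γ⁻¹ is given by 

$$\mathbf{x}(\infty) - \mathbf{x}_0 = -\mathbf{a} - \mathbf{b} = \frac{\mathbf{v}_{0\perp}(\Omega + \gamma)}{\omega^2 + \gamma^2} + \frac{\mathbf{v}_{0\parallel}}{\gamma}.$$

The trajectory lies in a plane if $\mathbf{v}_{0\parallel} = 0$. In this case, (6.22) and (6.23) imply 
the following simple relation between velocity and radius vector: 

$$\mathbf{r}^{-1}\mathbf{v} = \Omega - \gamma.$$

The angle φ between v and r is therefore given by 

$$\tan\varphi = \frac{|\mathbf{v}\wedge\mathbf{r}|}{\mathbf{v}\cdot\mathbf{r}} = -\frac{|\Omega|}{\gamma}.$$

Since this angle is the same at all points of the trajectory, the curve is 
commonly called an equiangular spiral. 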
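
The constant-angle property is easy to see numerically in the planar case. The following is a minimal sketch, not part of the original text: it assumes that, for motion confined to the plane of Ω, vectors in that plane can be represented by complex numbers with the unit bivector playing the role of the imaginary unit, so the spiral becomes r(t) = r₀ exp((iω − γ)t). The names and sample values are illustrative assumptions.

```python
import cmath
import math

def spiral_position(t, r0, omega, gamma):
    """Planar spiral r(t) = r0 * exp((i*omega - gamma) * t) as a complex number."""
    return r0 * cmath.exp((1j * omega - gamma) * t)

def spiral_velocity(t, r0, omega, gamma):
    """Time derivative of the spiral: (i*omega - gamma) * r(t)."""
    return (1j * omega - gamma) * spiral_position(t, r0, omega, gamma)

# Illustrative values (assumed).
r0, omega, gamma = 2.0 + 0.5j, 3.0, 0.4
angles = []
for t in (0.0, 0.5, 1.0, 2.0, 5.0):
    r = spiral_position(t, r0, omega, gamma)
    v = spiral_velocity(t, r0, omega, gamma)
    angles.append(cmath.phase(v / r))   # signed angle from r to v
print("angles between r and v:", angles)
print("tan(angle) =", math.tan(angles[0]), " -omega/gamma =", -omega / gamma)
```

The angle is the same at every sampled time and is obtuse, reflecting the fact that the particle spirals inward: its velocity always has a component directed toward the point r = 0.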

The spiral tracks of electrons in a bubble chamber shown in Figure 6.4 are 
not accurately described as equiangular, because the retarding force is not 
linear in the velocity and velocities of interest are high enough for relativistic 
effects to be significant. 


3-7. Uniform Electric and Magnetic Fields 


In this section we develop a general method for solving the equation of 
motion of a charged particle in uniform electric and magnetic fields. The 
equation of motion is 

$$m\dot{\mathbf{v}} = q\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right). \qquad (7.1)$$

As before, we suppress the constants by writing 

$$\mathbf{g} = \frac{q\mathbf{E}}{m}, \qquad (7.2a)$$

$$\Omega = i\boldsymbol{\omega} = -\frac{iq\mathbf{B}}{mc}, \qquad (7.2b)$$

so the equation of motion takes the form 

$$\dot{\mathbf{v}} = \mathbf{g} + \mathbf{v}\cdot\Omega = \mathbf{g} + \boldsymbol{\omega}\times\mathbf{v}. \qquad (7.3)$$

Fig. 6.4. An electron loses kinetic energy by emitting a photon (light quantum), as shown by 
the sudden increase in the curvature of its trajectory (track) in a propane bubble chamber. The 
emitted photon creates an electron-positron pair. The curvature of an electron trajectory in a 
magnetic field perpendicular to the photograph increases as the electron loses energy by 
collisions. Since electrons and positrons have the same mass but opposite charges, their trajectories 
in a magnetic field curve in opposite directions. The smaller curvature of the positron’s 
trajectory shows that it was created with more kinetic energy than the electron. 


For uniform fields, $\mathbf{g} = \mathbf{g}(t)$ and $\boldsymbol{\Omega} = \boldsymbol{\Omega}(t)$ are functions of time but not of
position.

We aim to solve Equation (7.3) by the method of integrating factors. As in 
the preceding section, we introduce an integrating factor R = R(t) defined by 
the equation 


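    $\dot{R} = \tfrac{1}{2}R\,\boldsymbol{\Omega}$    (7.4a)

and the initial condition

    $R(0) = 1.$    (7.4b)

This enables us to put Equation (7.3) in the form

    $\frac{d}{dt}\left(R\mathbf{v}R^{\dagger}\right) = R\mathbf{g}R^{\dagger}.$

Integrating and solving for $\mathbf{v}$, we get the general solution

    $\mathbf{v} = R^{\dagger}\left\{\mathbf{v}_0 + \int_0^t R\mathbf{g}R^{\dagger}\,dt\right\}R,$    (7.5)

or, with the arguments made explicit,

    $\mathbf{v}(t) = R^{\dagger}(t)\left\{\mathbf{v}_0 + \int_0^t R(s)\mathbf{g}(s)R^{\dagger}(s)\,ds\right\}R(t).$

After this has been used to evaluate $\mathbf{v}$ as an explicit function of $t$, the
trajectory is found directly by integrating $\dot{\mathbf{x}} = \mathbf{v}(t)$.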

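Now let us examine some specific solutions. For constant $\boldsymbol{\Omega}$ we know that
(7.4) has the solution

    $R = e^{\frac{1}{2}\boldsymbol{\Omega}t} = e^{\frac{1}{2}i\boldsymbol{\omega}t}.$    (7.6)

In this case, we can simplify the integral in (7.5) with a procedure we have
used before. We decompose $\mathbf{g}$ into components parallel and perpendicular to
$\boldsymbol{\omega}$, which have the algebraic properties

    $\mathbf{g}_{\parallel}\boldsymbol{\Omega} = \mathbf{g}_{\parallel}\wedge\boldsymbol{\Omega} = \boldsymbol{\Omega}\mathbf{g}_{\parallel},$    (7.7a)

    $\mathbf{g}_{\perp}\boldsymbol{\Omega} = \mathbf{g}_{\perp}\cdot\boldsymbol{\Omega} = -\boldsymbol{\Omega}\mathbf{g}_{\perp};$    (7.7b)

whence,

    $R\mathbf{g}_{\parallel} = \mathbf{g}_{\parallel}R,$    (7.7c)

    $R\mathbf{g}_{\perp} = \mathbf{g}_{\perp}R^{\dagger}.$    (7.7d)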




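So the integral can be put in the form

    $\int_0^t R\mathbf{g}R^{\dagger}\,dt = \int_0^t \mathbf{g}_{\perp}e^{-\boldsymbol{\Omega}t}\,dt + \int_0^t \mathbf{g}_{\parallel}\,dt,$    (7.8)

ready to be integrated when the functional form of $\mathbf{g}$ is specified.
When $\mathbf{g}$ is constant, the integral (7.8) has the specific value

    $\int_0^t R\mathbf{g}R^{\dagger}\,dt = \mathbf{g}_{\perp}\boldsymbol{\Omega}^{-1}\left(1 - e^{-\boldsymbol{\Omega}t}\right) + \mathbf{g}_{\parallel}t.$

Inserting this into (7.5) and using (7.7), we get the velocity in the form

    $\mathbf{v} = \mathbf{a}\,e^{\boldsymbol{\Omega}t} + \mathbf{b}t + \mathbf{c},$    (7.9)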


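where the coefficients are given by

    $\mathbf{a} = \mathbf{v}_{0\perp} + \mathbf{g}\cdot\boldsymbol{\Omega}^{-1} = \mathbf{v}_{0\perp} + \mathbf{g}\times\boldsymbol{\omega}^{-1} = \mathbf{v}_{0\perp} - c\,\mathbf{E}\times\mathbf{B}^{-1},$    (7.10a)

    $\mathbf{b} = \mathbf{g}_{\parallel} = \mathbf{g}\cdot\boldsymbol{\omega}^{-1}\,\boldsymbol{\omega} = \frac{q}{m}\mathbf{E}_{\parallel},$    (7.10b)

    $\mathbf{c} = \mathbf{v}_{0\parallel} - \mathbf{g}\cdot\boldsymbol{\Omega}^{-1} = \mathbf{v}_{0\parallel} - \mathbf{g}\times\boldsymbol{\omega}^{-1} = \mathbf{v}_{0\parallel} + c\,\mathbf{E}\times\mathbf{B}^{-1}.$    (7.10c)

Since $\mathbf{a}$ lies in the $\boldsymbol{\Omega}$-plane and $\mathbf{b}$ is orthogonal to that plane, Equation (7.9)
is a parametric equation for a helix with radius $|\mathbf{a}|$.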
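
As a numerical sanity check on (7.9) and (7.10), the following sketch compares the closed-form velocity with a direct Runge-Kutta integration of (7.3) for constant fields. It is an illustration only: ordinary vector algebra stands in for geometric algebra (the rotation of the coefficient a is written with cross products), and the function names, sample values and step count are assumptions introduced here rather than anything prescribed by the text.

```python
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))

def scale(s, v):
    return tuple(s*x for x in v)

def closed_form_velocity(v0, g, omega, t):
    """Velocity v = a e^{Omega t} + b t + c for constant g and omega (omega != 0)."""
    w2 = dot(omega, omega)
    w = math.sqrt(w2)
    omega_inv = scale(1.0/w2, omega)             # omega^{-1} = omega/|omega|^2
    v0_par = scale(dot(v0, omega)/w2, omega)     # component of v0 along omega
    v0_perp = add(v0, scale(-1.0, v0_par))
    g_cross = cross(g, omega_inv)                # g x omega^{-1}
    a = add(v0_perp, g_cross)                    # (7.10a)
    b = scale(dot(g, omega)/w2, omega)           # (7.10b): g parallel to omega
    c = add(v0_par, scale(-1.0, g_cross))        # (7.10c)
    # a e^{Omega t}: rotate a about omega through the angle |omega| t
    a_rot = add(scale(math.cos(w*t), a), scale(math.sin(w*t)/w, cross(omega, a)))
    return add(a_rot, scale(t, b), c)

def rk4_velocity(v0, g, omega, t, steps=2000):
    """Integrate dv/dt = g + omega x v with the classical Runge-Kutta scheme."""
    def f(v):
        return add(g, cross(omega, v))
    h = t/steps
    v = v0
    for _ in range(steps):
        k1 = f(v)
        k2 = f(add(v, scale(h/2, k1)))
        k3 = f(add(v, scale(h/2, k2)))
        k4 = f(add(v, scale(h, k3)))
        v = add(v, scale(h/6, add(k1, scale(2, k2), scale(2, k3), k4)))
    return v
```

Calling both functions with the same initial velocity and fields should give values that agree to within the integration error, which is a quick way to catch a sign error in the coefficients.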
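By integrating (7.9) we find the trajectory

    $\mathbf{r} = \mathbf{a}\boldsymbol{\Omega}^{-1}e^{\boldsymbol{\Omega}t} + \tfrac{1}{2}\mathbf{b}t^2 + \mathbf{c}t,$    (7.11)

where an appropriate choice of origin has been made by writing

    $\mathbf{r} = \mathbf{x} - \mathbf{x}_0 + \mathbf{a}\boldsymbol{\Omega}^{-1}.$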


Equation (7.11) is best interpreted by regarding it as a composite of two
motions. First, a parabolic motion of the guiding center described by the
equation

    $\mathbf{r}_1 = \tfrac{1}{2}\mathbf{b}t^2 + \mathbf{c}t.$    (7.12a)

Second, a uniform circular motion about the guiding center described by

    $\mathbf{r}_2 = \mathbf{a}\boldsymbol{\Omega}^{-1}e^{\boldsymbol{\Omega}t}.$    (7.12b)

The composite motion $\mathbf{r} = \mathbf{r}_1 + \mathbf{r}_2$ can be visualized as the motion of a point
on a spinning disk traversing a parabola with its axis aligned along the
vertical, as illustrated in Figure 7.1. Corresponding directions of electric and
magnetic fields are shown in Figure 7.2.

Fig. 7.1. Trajectory of a charged particle in uniform electric and magnetic fields.
(Labels: path of guiding center; trajectory.)

Fig. 7.2.

Motion about the guiding center averages to zero over a time period of
$2\pi|\boldsymbol{\Omega}|^{-1} = 2\pi mc\,|q\mathbf{B}|^{-1}$, so motion of the guiding center can be regarded as
an average motion of the particle. Accordingly, the velocity of the guiding
center is called the drift velocity of the particle.


Motion in Orthogonal Electric and Magnetic Fields


The special case of motion in orthogonal electric and magnetic fields is of
particular interest. From (7.10b) we see that if $\mathbf{E}\cdot\mathbf{B} = 0$, then $\mathbf{b} = 0$,
so according to (7.11) or (7.12), the parabolic path of the guiding center
reduces to a straight line. If, in addition, the initial velocity is orthogonal
to the magnetic field, then, by (7.10c) and (7.12a), the drift velocity is

    $\dot{\mathbf{r}}_1 = \mathbf{c} = \boldsymbol{\omega}^{-1}\times\mathbf{g} = c\,\mathbf{E}\times\mathbf{B}^{-1}.$    (7.13)

Surprisingly, the drift velocity in this case is orthogonal to the electric as
well as the magnetic field. The particle trajectory can be visualized as the
path of a point on a disk spinning with angular speed $\omega = -q|\mathbf{B}|/mc$ as it
moves in the plane of the disk with constant speed
$|\mathbf{c}| = c\,|\mathbf{E}\times\mathbf{B}^{-1}| = c\,|\mathbf{E}|/|\mathbf{B}|$. The Greeks long ago gave the name trochoid
(trochos = wheel) to curves of this kind. The trochoids fall into three classes
characterized by the conditions
$|\boldsymbol{\omega}\times\mathbf{a}| = |\mathbf{c}|$, $|\boldsymbol{\omega}\times\mathbf{a}| < |\mathbf{c}|$, $|\boldsymbol{\omega}\times\mathbf{a}| > |\mathbf{c}|$,
with respective curves illustrated by Figures 7.3a, b, c. As the figures suggest,
the three curves can also be interpreted as the paths of particles rigidly
attached to a wheel which is rolling without slipping, the three conditions
being that the particle is attached on the rim, inside the rim or outside the
rim, respectively.

Fig. 7.3. Trajectories of a charged particle in orthogonal electric and magnetic fields.

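According to (7.11) or (7.12), the particle motion coincides with the
guiding center motion when $\mathbf{a} = 0$, which, according to (7.10a), occurs when

    $\mathbf{v}_{0\perp} = \boldsymbol{\omega}^{-1}\times\mathbf{g} = c\,\mathbf{E}\times\mathbf{B}^{-1}.$    (7.14)

The trajectory is a straight line if $\mathbf{E}\cdot\mathbf{B} = 0$. This suggests a practical way to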
construct a velocity filter for charged particles. Only a particle which satisfies 
the condition (7.14) will continue undeflected along its initial line of motion. 
The E and B fields can be adjusted to select any velocity in a wide range. The 
selection is independent of the sign of the charge or the mass of the particle. 
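
The velocity-filter condition can be illustrated with a short calculation. The sketch below evaluates the drift velocity c E x B^{-1} (with B^{-1} = B/|B|^2) for orthogonal fields in Gaussian units and checks that the Lorentz force per unit charge, E + v x B/c, vanishes for a particle moving with that velocity. The function names and the sample field values are assumptions chosen for the illustration.

```python
LIGHT_SPEED_CM_S = 2.998e10  # speed of light in cm/s (Gaussian units)

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def drift_velocity(E, B, c=LIGHT_SPEED_CM_S):
    """Return c E x B^{-1}, where B^{-1} = B/|B|^2."""
    b_squared = sum(x*x for x in B)
    return tuple(c*x/b_squared for x in cross(E, B))

def force_per_unit_charge(E, B, v, c=LIGHT_SPEED_CM_S):
    """Return E + (v x B)/c, the Lorentz force divided by the charge q."""
    v_cross_b = cross(v, B)
    return tuple(e + x/c for e, x in zip(E, v_cross_b))

# Orthogonal sample fields (hypothetical values, with |E| < |B| so the drift speed is below c)
E = (0.0, 1.0, 0.0)
B = (0.0, 0.0, 4.0)
v = drift_velocity(E, B)
print(v, force_per_unit_charge(E, B, v))
```

Because the charge q and the mass m never enter the calculation, the sketch also makes visible why the selection is independent of both.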


The Hall Effect 


The above results can be used to analyze the effect of an external magnetic
field on an electric current in a conductor. Suppose we have a conductor with
a current $I$ immersed in a uniform magnetic field $\mathbf{B}$. As Figure 7.4 shows,
whether the current is due to a flow of positive charges or negative charges,
the charge carriers in both cases will be deflected to the right by the magnetic
force $\mathbf{F}_{\pm} = q_{\pm}c^{-1}\mathbf{v}_{\pm}\times\mathbf{B}$. Consequently, carriers will accumulate on the right
wall of the conductor until they produce an electric field $\mathbf{E}_H$ sufficient to cancel
the magnetic force exactly. The condition (7.14) for a velocity filter has then
been met and charges flow undeflected in the conductor with speed

    $v = cE_H B^{-1}.$

The electric field $\mathbf{E}_H$ is manifested by a readily measured potential
difference $\phi_H$ between the right and left sides of the conductor.
The appearance of such a transverse potential difference induced in a
conductor by a magnetic field is known as the Hall Effect.

The sign of the Hall potential $\phi_H$ indicates the sign of the charge on the
carriers. The magnitude of $\phi_H$ permits an estimate of the density $N$ of
charge carriers. The current density is given by

    $J = \frac{I}{A} = Nqv,$

where $A$ is the cross-sectional area of the conductor and $q$ is the unit of
charge. Hence,

    $N = \frac{I}{qvA} = \frac{IB}{qcE_H A}.$

The quantities on the right are known or readily measured. The Hall effect
has been a valuable aid in the study of semiconductors, such as silicon and
germanium, in which $N$ is small compared with its value in metals, so that
the Hall potential is correspondingly large and easily measured.

Fig. 7.4. The Hall effect.

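We can estimate the magnitude of the Hall potential by taking resistance
into account. Suppose, as expected from Ohm’s law, that the resistive force
exerted by the medium on the charge carriers is linear. The equation of
motion (7.3) is then generalized to

    $\dot{\mathbf{v}} = \mathbf{g} + \mathbf{v}\cdot\boldsymbol{\Omega} - \gamma\mathbf{v}.$    (7.15)

The condition $\mathbf{v}\cdot\mathbf{B} = 0$ implies that $\mathbf{v}\cdot\boldsymbol{\Omega} = \mathbf{v}\boldsymbol{\Omega}$, so for terminal velocity
($\dot{\mathbf{v}} = 0$) Equation (7.15) gives

    $\mathbf{v}(\gamma - \boldsymbol{\Omega}) = \mathbf{g},$

or

    $\mathbf{v} = \mathbf{g}(\gamma - \boldsymbol{\Omega})^{-1} = \mathbf{g}\,\frac{\gamma + \boldsymbol{\Omega}}{\gamma^2 + \omega^2}.$    (7.16)

From this we can read off immediately that the direction of $\mathbf{v}$ differs from that
of $\mathbf{g}$ by the angle $\tan^{-1}(\gamma^{-1}|\boldsymbol{\Omega}|)$. In the case of the Hall effect, the electric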
field $\mathbf{E}$ has a component $\mathbf{E}_0$ collinear with $\mathbf{v}$ and a component $\mathbf{E}_H$ orthogonal to
it, so

    $\mathbf{g} = \frac{q}{m}\mathbf{E} = \frac{q}{m}(\mathbf{E}_0 + \mathbf{E}_H).$

The condition that components orthogonal to $\mathbf{v}$ cancel in (7.16) entails

    $\mathbf{E}_H\gamma + \mathbf{E}_0\boldsymbol{\Omega} = 0,$

whence

    $\mathbf{E}_H = -\gamma^{-1}\mathbf{E}_0\boldsymbol{\Omega}.$    (7.17)

By substituting this back into (7.16), it can be verified that $\mathbf{v} = q\mathbf{E}_0/m\gamma$, in
accordance with Ohm's law. Equation (7.17) can be used for experimental
determination of the resistive coefficient $\gamma$, and for some metals (e.g. bismuth)
the resistance is found to depend strongly on the magnetic field and its
orientation relative to the crystal axes in the conductor, revealing another
valuable application of the Hall effect.


Cyclotron Resonance


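As another application of our general method, let us study the motion of a
charged particle in a constant magnetic field subject to a periodic electric field
$\mathbf{E} = \mathbf{E}_0\cos\omega' t$ as well. To find the velocity of the particle, we substitute

    $\mathbf{g} = m^{-1}q\mathbf{E} = \mathbf{g}_0\cos\omega' t$    (7.18)

into (7.8) and evaluate the integrals

    $\int_0^t dt\,\cos\omega' t = \frac{\sin\omega' t}{\omega'},$

    $\int_0^t dt\,e^{-\mathbf{i}\omega t}\cos\omega' t = \frac{1}{2}\int_0^t dt\left(e^{\mathbf{i}(\omega' - \omega)t} + e^{-\mathbf{i}(\omega' + \omega)t}\right)$

        $= \frac{1}{2\mathbf{i}}\left(\frac{e^{\mathbf{i}(\omega' - \omega)t} - 1}{\omega' - \omega} - \frac{e^{-\mathbf{i}(\omega' + \omega)t} - 1}{\omega' + \omega}\right)$

        $= \frac{\mathbf{i}\omega + e^{-\mathbf{i}\omega t}\left(\omega'\sin\omega' t - \mathbf{i}\omega\cos\omega' t\right)}{\omega'^2 - \omega^2},$

where we have written the “cyclotron frequency” of the magnetic field in the
form $\boldsymbol{\Omega} = i\boldsymbol{\omega} = \mathbf{i}\omega$ for the convenience of operating with the unit bivector $\mathbf{i}$.
With the integral evaluated, Equation (7.5) gives us

    $\mathbf{v} = e^{-\frac{1}{2}\mathbf{i}\omega t}\left\{\mathbf{v}_0 + \mathbf{g}_{0\perp}\,\frac{\mathbf{i}\omega + e^{-\mathbf{i}\omega t}\left(\omega'\sin\omega' t - \mathbf{i}\omega\cos\omega' t\right)}{\omega'^2 - \omega^2} + \mathbf{g}_{0\parallel}\,\frac{\sin\omega' t}{\omega'}\right\}e^{\frac{1}{2}\mathbf{i}\omega t},$

which simplifies to

    $\mathbf{v} = \mathbf{g}_{0\perp}\,\frac{\omega'\sin\omega' t - \mathbf{i}\omega\cos\omega' t}{\omega'^2 - \omega^2} + \mathbf{g}_{0\parallel}\,\frac{\sin\omega' t}{\omega'} + \left(\mathbf{v}_{0\perp} + \mathbf{g}_{0\perp}\,\frac{\mathbf{i}\omega}{\omega'^2 - \omega^2}\right)e^{\mathbf{i}\omega t} + \mathbf{v}_{0\parallel}.$    (7.19)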
This can be integrated to get the trajectory, but the feature of greatest interest
is apparent right here, namely the fact that the first term on the right side of
(7.19) is infinite when $\omega' = \omega$. This implies the existence of a resonance when
the “driving frequency” $\omega'$ is in the neighborhood of the cyclotron frequency
$\omega$. As we shall see later, dissipative effects must be taken into account near
resonance, with the consequence that the infinity in (7.19) is averted, but the
velocity $\mathbf{v}$ has a maximum magnitude at resonance. At the same time the
terms depending on $\mathbf{v}_0$ in (7.19) are “damped out”. The chief significance of
all this is that at resonance the particle extracts energy from the driving
electric field at a maximum rate, energy which is lost in collisions or by
reradiation.




These considerations help explain attenuation of radio waves in the
ionosphere. The driving field (7.18) can be attributed to a plane polarized
electromagnetic wave with frequency $\omega'/2\pi$. For free electrons in the
ionosphere the earth’s magnetic field strength of $5\times 10^{-1}$ gauss yields a cyclotron
frequency of magnitude $\omega = -qB/mc = 8.5\times 10^{6}\ \mathrm{s}^{-1}$. So we expect resonant
absorption of electromagnetic waves with frequency near $\omega/2\pi = 1400$ kHz.
The ionosphere does indeed exhibit a marked absorption of radio waves in
that frequency range.
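
The order of magnitude quoted for the ionosphere is easy to reproduce. The sketch below evaluates |q|B/mc in Gaussian units for an electron in a field of 0.5 gauss, which is the field strength assumed here for the earth's field. The rounded physical constants and the function name are assumptions of this illustration, so the result agrees with the figures in the text only to within a few percent.

```python
import math

ELECTRON_CHARGE_ESU = 4.803e-10   # magnitude of the electron charge (statcoulomb)
ELECTRON_MASS_G = 9.109e-28       # electron mass (gram)
LIGHT_SPEED_CM_S = 2.998e10       # speed of light (cm/s)

def cyclotron_angular_frequency(b_gauss,
                                q=ELECTRON_CHARGE_ESU,
                                m=ELECTRON_MASS_G,
                                c=LIGHT_SPEED_CM_S):
    """Return the magnitude |q| B / (m c) in radians per second (Gaussian units)."""
    return abs(q) * b_gauss / (m * c)

omega = cyclotron_angular_frequency(0.5)      # assumed field strength of 0.5 gauss
frequency_khz = omega / (2 * math.pi) / 1e3   # ordinary frequency in kHz
print(f"omega = {omega:.3e} rad/s, frequency = {frequency_khz:.0f} kHz")
```

The computed frequency falls in the AM broadcast band, consistent with the resonant absorption described above.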

The same considerations are important for describing the propagation of 
electromagnetic waves in the vicinity of stars and through plasmas generated 
in the laboratory. 

We have not considered problems with time-varying magnetic fields, but it
should be pointed out that, for uniform magnetic fields with fixed direction
but time-dependent magnitude, the solution of (7.4) has the general form

    $R = e^{\frac{1}{2}\int_0^t \boldsymbol{\Omega}\,dt} = e^{\frac{1}{2}\mathbf{i}\int_0^t \omega\,dt},$    (7.20)

which can be evaluated by straightforward integration. Solutions of (7.4)
when the direction of $\boldsymbol{\Omega}$ is time-dependent will be obtained later.


3-7. Exercises 


(7.1) Solve Equation (7.3) for constant fields by the method of undeter-
      mined coefficients. Since the equation is linear, we expect the
      solution to be expressible as a sum of solutions for the special cases
      $\mathbf{g} = 0$ and $\boldsymbol{\Omega} = 0$ which we found earlier. Therefore we expect a
      solution of the form

          $\mathbf{v} = \mathbf{a}\,e^{\boldsymbol{\Omega}t} + \mathbf{b}t + \mathbf{c}.$

      Verify this and evaluate the coefficients by substitution in (7.3) and
      imposition of initial conditions.

(7.2) A charged particle in constant, orthogonal electric and magnetic
      fields is at rest at the origin at time $t = 0$. Determine the time and
      place of its next “rest stop”.

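(7.3) Show that the parametric equation (7.11) for the trajectory in
      constant fields can be put in the form

          $\mathbf{r} = \mathbf{v}_0 t + \tfrac{1}{2}\mathbf{g}t^2 + \boldsymbol{\omega}\times\left[\mathbf{v}_0\,\frac{1 - \cos\omega t}{\omega^2} + \mathbf{g}\,\frac{\omega t - \sin\omega t}{\omega^3}\right]$

              $+\ \boldsymbol{\omega}\times\left(\boldsymbol{\omega}\times\left[\mathbf{v}_0\,\frac{\omega t - \sin\omega t}{\omega^3} + \mathbf{g}\,\frac{\tfrac{1}{2}\omega^2 t^2 - 1 + \cos\omega t}{\omega^4}\right]\right),$

      (with $\mathbf{r}$ measured from the initial position) showing the deviation
      from the parabolic trajectory obtained when $\omega = 0$. Evaluate the
      deviation to fourth order in $\omega = |\boldsymbol{\omega}|$.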

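(7.4) Use the method of integrating factors to solve the equation of
      motion

          $\dot{\mathbf{v}} = \mathbf{g} + \mathbf{v}\cdot\boldsymbol{\Omega} - \gamma\mathbf{v},$

      for a charged particle in constant electric and magnetic fields subject
      to a linear resistive force. Verify that the trajectory is given by

          $\mathbf{r} = \left(\mathbf{g}_{\perp}Z^{-1} + \mathbf{v}_{0\perp}\right)Z^{-1}e^{Zt} - \mathbf{g}_{\perp}Z^{-1}t + \mathbf{g}_{\parallel}\,\frac{e^{-\gamma t} - 1 + \gamma t}{\gamma^2} + \mathbf{v}_{0\parallel}\,\frac{1 - e^{-\gamma t}}{\gamma},$

      where $Z = \boldsymbol{\Omega} - \gamma$, the parallel and perpendicular components of the
      vectors are defined as in Equation (7.7), and $\mathbf{r}$ is related to the
      position $\mathbf{x}$ by

          $\mathbf{r} = \mathbf{x} - \mathbf{x}_0 + \left(\mathbf{g}_{\perp}Z^{-1} + \mathbf{v}_{0\perp}\right)Z^{-1}.$

      Draw a rough sketch and describe the solution, taking the last three
      terms to describe the guiding center.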


3-8. Linear Binding Force 


The binding of an electron to an atom and the binding of an atom to a molecule
are examples of the ubiquitous and general phenomenon of binding. To understand
this phenomenon, we need an equally general mathematical theory of
binding forces and the motion of bound particles.

A binding force is understood to be a force which tends to confine a particle 
to some finite region of space. The force function of a bound particle must be 
a function of position, and we can develop a theory of binding without 
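considering velocity-dependent forces. Let us determine some general
properties of binding forces. A point $\mathbf{x}_0$ is said to be an equilibrium point of a force
$\mathbf{f} = \mathbf{f}(\mathbf{x})$ if

    $\mathbf{f}(\mathbf{x}_0) = 0.$    (8.1)

The equilibrium point $\mathbf{x}_0$ is said to be isolated if there is some neighborhood of
$\mathbf{x}_0$ which does not contain any other equilibrium points. Let us focus attention
on such a point and such a neighborhood.

Near an equilibrium point $\mathbf{x}_0$, a simple approximation to any force function
is obtained by using the Taylor expansion (as explained in Section 2-8)

    $\mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{x}_0 + \mathbf{r}) = \mathbf{f}(\mathbf{x}_0) + \mathbf{r}\cdot\nabla\mathbf{f}(\mathbf{x}_0) + \tfrac{1}{2}(\mathbf{r}\cdot\nabla)^2\mathbf{f}(\mathbf{x}_0) + \ldots .$    (8.2)

The equation of motion $m\ddot{\mathbf{x}} = \mathbf{f}$ for a bound particle is then given in terms of
the displacement from equilibrium $\mathbf{r} = \mathbf{x} - \mathbf{x}_0$ by

    $m\ddot{\mathbf{x}} = m\ddot{\mathbf{r}} = \mathbf{L}(\mathbf{r}) + \mathbf{Q}(\mathbf{r}) + \ldots,$    (8.3)

where $\mathbf{L}(\mathbf{r}) = \mathbf{r}\cdot\nabla\mathbf{f}(\mathbf{x}_0)$ and $\mathbf{Q}(\mathbf{r}) = \tfrac{1}{2}(\mathbf{r}\cdot\nabla)^2\mathbf{f}(\mathbf{x}_0)$. The force function $\mathbf{L}(\mathbf{r})$ is
linear, that is, it has the property

    $\mathbf{L}(\alpha_1\mathbf{r}_1 + \alpha_2\mathbf{r}_2) = \alpha_1\mathbf{L}(\mathbf{r}_1) + \alpha_2\mathbf{L}(\mathbf{r}_2),$    (8.4)

where $\alpha_1$ and $\alpha_2$ are constant scalars. The force function $\mathbf{Q}(\mathbf{r})$ is a quadratic
vector function of $\mathbf{r}$. Whatever the exact form of $\mathbf{f}(\mathbf{x})$, if $\mathbf{L}(\mathbf{r}) \neq 0$, there is a
(sufficiently small) neighborhood of $\mathbf{x}_0$, defined roughly by the condition
$|\mathbf{L}(\mathbf{r})| \gg |\mathbf{Q}(\mathbf{r})|$, in which the equation

    $m\ddot{\mathbf{r}} = \mathbf{L}(\mathbf{r})$    (8.5)

provides an accurate description of particle motion. The particle will remain
in a neighborhood of the equilibrium point if $\mathbf{r}\cdot\mathbf{f} < 0$, which is to say that the
particle's acceleration must be directed toward the interior of the
neighborhood. This can be expressed as a binding property of the force by using (8.5);
thus,

    $\mathbf{r}\cdot\mathbf{L}(\mathbf{r}) \leq 0,$    (8.6)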


with equality only if r = 0. This relation is also called a stability condition, 
because it is a necessary condition for the particle to remain bound. 

A particle subject to a linear binding force L(r) is called a harmonic 
oscillator, because its motion is similar to vibrations in musical instruments. 
Indeed, if a plucked violin string is represented as a system of particles, the 
harmonic oscillator provides a good description of each particle. The quad- 
ratic force Q(r) and higher order terms in the Taylor expansion of the force 
are called anharmonic perturbations, and when they are included in its 
description, the particle is said to be an anharmonic oscillator. The most 
significant difference between harmonic and anharmonic oscillations is that 
the former obey a linear superposition principle; specifically, if $\mathbf{r}_1 = \mathbf{r}_1(t)$ and
$\mathbf{r}_2 = \mathbf{r}_2(t)$ are solutions of the equation of motion (8.5), then, as a
consequence of (8.4), so is the “linear superposition” $\mathbf{r}_3 = \alpha_1\mathbf{r}_1 + \alpha_2\mathbf{r}_2$. This
superposition principle makes the analysis of harmonic motions easy, and the 
lack of any such general principle makes the analysis of anharmonic motions 
difficult. Anharmonic motions are most easily analyzed by perturbation 
theory as small deviations from harmonic motion. 


The Isotropic Oscillator 


A particle subject to an isotropic binding force is called an isotropic oscillator. 
In this case L(r) has the simple form 


    $\mathbf{L}(\mathbf{r}) = -k\mathbf{r},$    (8.7)

where $k$ is a constant positive scalar called the force constant. Equation (8.7)
will be recognized as Hooke’s Law, but the term “Hooke’s Law” is commonly
used to refer to any linear binding force. Obviously, Hooke’s Law is so
ubiquitous in physics because any binding force can be approximated by a
linear force under some conditions. By the same token, it is evident that
Hooke’s Law is not a fundamental force law, but only a useful approximation.

Let us now examine the motion of an isotropic oscillator. Its equation of 
motion can be put in the form 


    $\ddot{\mathbf{r}} = -\frac{k}{m}\mathbf{r}.$    (8.8)

Our experience with exponential functions suggests that this equation might
have a solution of the form

    $\mathbf{r} = \mathbf{a}\,e^{\lambda t}.$

Inserting this trial solution into (8.8) and carrying out the differentiation, we
find that the equation of motion is satisfied if and only if

    $\lambda^2 = -\frac{k}{m}.$

This algebraic equation is called the characteristic equation of (8.8); it has the
roots

    $\lambda = \pm\mathbf{i}\omega_0,$

where $\mathbf{i}^2 = -1$ and

    $\omega_0 = \left(\frac{k}{m}\right)^{1/2}.$    (8.9)

To each of these roots there corresponds a distinct solution of (8.8), namely

    $\mathbf{r}_+ = \mathbf{a}_+e^{\mathbf{i}\omega_0 t} = \mathbf{a}_+(\cos\omega_0 t + \mathbf{i}\sin\omega_0 t),$    (8.10a)

and

    $\mathbf{r}_- = \mathbf{a}_-e^{-\mathbf{i}\omega_0 t} = \mathbf{a}_-(\cos\omega_0 t - \mathbf{i}\sin\omega_0 t).$    (8.10b)

The reader will recognize that we are freely using properties of the exponential
function determined in Section 2-5. Notice that these equations imply that
$\mathbf{i}$ is a bivector satisfying

    $\mathbf{a}_{\pm}\wedge\mathbf{i} = 0,$    (8.11)

because $\mathbf{r}_{\pm}$ and $\mathbf{a}_{\pm}$ must all be vectors. This tells us the significance of the
“imaginary roots” of the characteristic equation. The “imaginary” $\mathbf{i}$ is the unit
bivector for a plane in which the solution vectors $\mathbf{r}_{\pm}$ lie at all times.

According to the superposition principle, we can add solutions $\mathbf{r}_+$ and $\mathbf{r}_-$ to
get a solution

    $\mathbf{r} = \mathbf{a}_+e^{\mathbf{i}\omega_0 t} + \mathbf{a}_-e^{-\mathbf{i}\omega_0 t}.$    (8.12)

Alternatively, this solution can be written in the form

    $\mathbf{r} = \mathbf{a}_0\cos\omega_0 t + \mathbf{b}_0\sin\omega_0 t,$    (8.13)

where

    $\mathbf{a}_0 = \mathbf{a}_+ + \mathbf{a}_-,$
                                                  (8.14)
    $\mathbf{b}_0 = (\mathbf{a}_+ - \mathbf{a}_-)\mathbf{i}.$

The two constant vectors $\mathbf{a}_0$ and $\mathbf{b}_0$ can have any values, so we know that
(8.12) is the general solution of (8.8); thus, the orbit of any harmonic
oscillator can be described by an equation of the form (8.13) or (8.12). Note
that if either $\mathbf{a}_0 = 0$ or $\mathbf{b}_0 = 0$, then $\mathbf{a}_+\wedge\mathbf{a}_- = 0$ and the bivector $\mathbf{i}$ in (8.12) is
not uniquely determined by the condition (8.11); however, any unit bivector $\mathbf{i}$
satisfying (8.10) will do.

Equation (8.13) has the advantage of being directly related to initial 
conditions, for the vector coefficients are related to the initial position and 
velocity by 


    $\mathbf{r}_0 = \mathbf{r}(0) = \mathbf{a}_0,$
                                                  (8.15)
    $\mathbf{v}_0 = \dot{\mathbf{r}}(0) = \omega_0\mathbf{b}_0.$

Equation (8.13) represents the motion as a superposition of independent
simple harmonic motions along the lines determined by $\mathbf{r}_0$ and $\mathbf{v}_0$, and, of
course, it reduces to one dimensional simple harmonic motion if $\mathbf{r}_0 = 0$ or
$\mathbf{v}_0 = 0$ or, more generally, if $\mathbf{r}_0\wedge\mathbf{v}_0 = \omega_0\,\mathbf{a}_0\wedge\mathbf{b}_0 = 0$. The resultant motion is
elliptical. This may be more obvious if (8.13) is recast in the standard form
(see Exercise 8.1):

    $\mathbf{r} = \mathbf{a}\cos\phi + \mathbf{b}\sin\phi,$    (8.16a)

where $\mathbf{a}^2 \geq \mathbf{b}^2$, $\mathbf{a}\cdot\mathbf{b} = 0$, and

    $\phi = \phi(t) = \omega_0 t + \phi_0.$    (8.16b)

The scalar constant $\phi_0$ in (8.16b) can be eliminated, if desired, by writing
$\phi_0 = \omega_0 t_0$ and shifting the zero of time by $t_0$. It will be recognized that
(8.16a) is a parametric equation for an ellipse with major axis $\mathbf{a}$ and minor
axis $\mathbf{b}$ (Figure 8.1).

Fig. 8.1. Orbit of an isotropic harmonic oscillator.

The elliptical motion of an isotropic oscillator is periodic in time. A particle
motion is said to be periodic if its state variables $\mathbf{r}$ and $\dot{\mathbf{r}}$ have exactly the
same values at any two times separated by a definite time interval $T$ called the
period of the motion. For the elliptical motion (8.16), the period $T$ is related
to the natural frequency $\omega_0$ by

    $\omega_0 T = 2\pi.$    (8.17)

The motion during a single period or cycle is called an oscillation or, if it is
one dimensional, a vibration. The constant $\phi_0$ in (8.16b) is referred to as the
phase of an oscillation beginning at $t = 0$. The maximum displacement from
equilibrium during an oscillation is called the amplitude of the oscillator. For
the elliptical motion (8.16), the amplitude is $a = |\mathbf{a}|$.

Equation (8.12) represents the elliptical motion as a superposition of two
uniform circular motions (8.10a, b) with opposite senses. This is illustrated
in Figure 8.2 for $\phi = \omega_0 t$. As the figure suggests, this relation provides a
practical means for constructing an ellipse from two circles.

Fig. 8.2. Elliptical motion as a superposition of coplanar circular motions.

Elliptical motion can also be represented as a superposition of two circular
motions with the same amplitude, frequency and phase. As shown in Figure 8.3,
the circular motions are in two distinct planes intersecting along the
major axis of the resultant ellipse. This relation is described by the equation

    $\mathbf{r} = \tfrac{1}{2}\mathbf{a}\left(e^{\mathbf{i}_+\phi} + e^{\mathbf{i}_-\phi}\right),$    (8.18)

where $\mathbf{i}_+$ and $\mathbf{i}_-$ are unit bivectors for the two planes. It can be shown (Exercise
8.3) that Equations (8.18) and (8.16a) are equivalent if

    $\mathbf{b} = \tfrac{1}{2}\mathbf{a}(\mathbf{i}_+ + \mathbf{i}_-) = \tfrac{1}{2}\mathbf{a}\cdot(\mathbf{i}_+ + \mathbf{i}_-).$    (8.19)

We will encounter other significant forms for the equation of an ellipse
later on.


Fig. 8.3. Elliptical motion as a superposition of noncoplanar circular motions with equal
amplitudes.


The Anisotropic Oscillator


Let us turn now to a brief consideration of the anisotropic oscillator. An
anisotropic linear binding force $\mathbf{L}(\mathbf{r})$ is characterized by the existence of three
orthogonal principal vectors $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$ with the properties

    $\mathbf{L}(\mathbf{a}_1) = -k_1\mathbf{a}_1,$
    $\mathbf{L}(\mathbf{a}_2) = -k_2\mathbf{a}_2,$
    $\mathbf{L}(\mathbf{a}_3) = -k_3\mathbf{a}_3,$    (8.20)

where $k_1$, $k_2$ and $k_3$ are positive force constants describing the strength of the
binding force along the three principal directions. Linear functions with the
property (8.20) will be studied systematically in Chapter 5. Our problem,
now, is to solve the equation of motion (8.5) subject to (8.20). The
superposition principle enables us to decompose the general motion into independent
one dimensional motions along the three principal directions; for if $\mathbf{r}_j$ is the
component of displacement proportional to $\mathbf{a}_j$, then (8.5), (8.20) and (8.4)
imply

    $m\ddot{\mathbf{r}} = m\ddot{\mathbf{r}}_1 + m\ddot{\mathbf{r}}_2 + m\ddot{\mathbf{r}}_3 = -k_1\mathbf{r}_1 - k_2\mathbf{r}_2 - k_3\mathbf{r}_3;$

but the $\mathbf{r}_j$ are orthogonal, so each component must independently satisfy the
equation

    $m\ddot{\mathbf{r}}_j = -k_j\mathbf{r}_j.$

The solutions to this equation must have the same general form as those for
the isotropic oscillator restricted to one dimensional motion. Hence, the
general solution for the anisotropic oscillator is given by

    $\mathbf{r} = \mathbf{a}_1\cos(\omega_1 t + \phi_1) + \mathbf{a}_2\cos(\omega_2 t + \phi_2) + \mathbf{a}_3\cos(\omega_3 t + \phi_3),$    (8.21a)

with the three natural frequencies given by

    $\omega_j = \left(\frac{k_j}{m}\right)^{1/2}.$    (8.21b)

The motion described by (8.21a) will not be periodic and along a closed curve
unless the ratios $\omega_1/\omega_2$ and $\omega_2/\omega_3$ are rational numbers. In general, the orbit
will not lie in a plane, but it will lie within an ellipsoid centered at the
equilibrium point with principal axes $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$. If $\mathbf{a}_3 = 0$, then the orbit will
lie in the $\mathbf{a}_1\mathbf{a}_2$-plane; and it is commonly known as a Lissajous figure.

An atom in a crystalline solid is typically subject to an anisotropic binding 
force determined by the structure of the solid. The modeling of such an atom 
as an anisotropic oscillator is one of the basic theoretical techniques of solid 
state physics. The amplitude of such an oscillator is of the order of atomic
dimensions $10^{-8}$ cm, while the vibrational frequencies in solids range from $10^{12}$
to $10^{14}$ Hz (1 Hertz = 1 cycle sec$^{-1}$). It has been said that quantum mechanics
is needed to describe interactions at the atomic level. Quantum mechanics is, 
indeed, required to calculate force constants and vibrational frequencies. 
However, given force constants determined either experimentally or
theoretically, much can be inferred from the model of an atom as a classical
oscillator. The oscillator model is so useful because, as we have seen, it does
not commit us to definite assumptions about the ‘‘true nature”’ of the binding 
force. 


Energy Conservation 


When an oscillator is used as an atomic or molecular model, the actual orbit 
of the oscillator is only of peripheral interest. Considering the small ampli- 
tudes and high frequencies of atomic vibrations, there is evidently no hope of 
directly observing the orbits. Only general features of the motion are suscep- 
tible to measurement, namely, the frequencies and amplitudes of oscillation. 
Of course, the amplitude cannot be measured directly, but measurements of a 
closely related quantity, the energy, are possible. To determine the energy of
an isotropic oscillator, we multiply (8.8) by $\dot{\mathbf{r}}$ and observe that

    $\dot{\mathbf{r}}\cdot\left(m\ddot{\mathbf{r}} + k\mathbf{r}\right) = \frac{d}{dt}\left(\tfrac{1}{2}m\dot{\mathbf{r}}^2 + \tfrac{1}{2}k\mathbf{r}^2\right) = 0.$

Hence, the quantity

    $E = \tfrac{1}{2}m\dot{\mathbf{r}}^2 + \tfrac{1}{2}k\mathbf{r}^2$    (8.22)

is a constant of the motion, that is, a function of the state variables $\mathbf{r}$ and $\dot{\mathbf{r}}$
which is independent of time. The quantity $E$ is called the (total) energy of the
oscillator, and Equation (8.22) expresses it as a sum of kinetic and potential
energies respectively. The energy is related to the amplitude by substituting
the solution (8.16) into (8.22), with the result

    $E = \tfrac{1}{2}k\left(\mathbf{a}^2 + \mathbf{b}^2\right).$    (8.23)


Of course, energy is an important state variable even for a macroscopic
oscillator, but for an atomic oscillator it is indispensable.
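
The constancy of the energy along the elliptical orbit can be confirmed numerically. The following sketch evaluates the kinetic plus potential energy on the orbit r = a cos(phi) + b sin(phi), with phi = omega_0 t + phi_0, and compares it with k(a^2 + b^2)/2. The function names and the sample values of m, k, a and b are assumptions made for the illustration.

```python
import math

def dot(u, v):
    return sum(x*y for x, y in zip(u, v))

def oscillator_state(a, b, omega0, t, phi0=0.0):
    """Position and velocity on the orbit r = a cos(phi) + b sin(phi), phi = omega0 t + phi0."""
    phi = omega0*t + phi0
    r = tuple(ai*math.cos(phi) + bi*math.sin(phi) for ai, bi in zip(a, b))
    v = tuple(omega0*(-ai*math.sin(phi) + bi*math.cos(phi)) for ai, bi in zip(a, b))
    return r, v

def energy(m, k, r, v):
    """Total energy: kinetic (m v^2 / 2) plus potential (k r^2 / 2)."""
    return 0.5*m*dot(v, v) + 0.5*k*dot(r, r)

# Sample oscillator (hypothetical values) with orthogonal semi-axes a and b
m, k = 2.0, 8.0
omega0 = math.sqrt(k/m)
a, b = (3.0, 0.0, 0.0), (0.0, 1.0, 0.0)
expected = 0.5*k*(dot(a, a) + dot(b, b))
for t in (0.0, 0.3, 1.1, 2.7):
    r, v = oscillator_state(a, b, omega0, t)
    print(t, energy(m, k, r, v), expected)
```

Each printed energy should match the expected value to rounding error, whatever time is chosen.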


The Damped Isotropic Oscillator 


A physical system which can be modeled as an oscillator is never isolated 
from other interactions besides the binding force. Invariably there are inter- 
actions which resist the motion of the oscillator. Such interactions can be 
accounted for, at least qualitatively, by introducing a linear resistive force
$-m\gamma\dot{\mathbf{r}}$, so that the equation for the motion of an isotropic oscillator becomes

    $\ddot{\mathbf{r}} + \gamma\dot{\mathbf{r}} + \omega_0^2\mathbf{r} = 0,$    (8.24)

where $\omega_0^2 = k/m$ as before. The resistive force is also called a damping force,
because it reduces the amplitude of oscillation, or a dissipative force, because
it dissipates the energy of the oscillator.

To solve Equation (8.24), we substitute into it the trial solution $\mathbf{r} = \mathbf{a}\,e^{\lambda t}$
which worked before. After carrying out the differentiations, we find that the
solution works if $\lambda$ is a root of the characteristic equation

    $\lambda^2 + \gamma\lambda + \omega_0^2 = 0.$

Thus,

    $\lambda = -\tfrac{1}{2}\gamma \pm \left(\tfrac{1}{4}\gamma^2 - \omega_0^2\right)^{1/2}.$    (8.25)


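It is readily verified that a linear superposition of solutions to (8.24) is again a
solution. Hence, we get the general solution by adding solutions
corresponding to the two roots of (8.25), namely,

    $\mathbf{r} = e^{-\frac{1}{2}\gamma t}\left(\mathbf{a}_+e^{(\frac{1}{4}\gamma^2 - \omega_0^2)^{1/2}t} + \mathbf{a}_-e^{-(\frac{1}{4}\gamma^2 - \omega_0^2)^{1/2}t}\right).$    (8.26)

Actually, we have three types of solutions corresponding to positive, negative
and zero values of the quantity $\tfrac{1}{4}\gamma^2 - \omega_0^2$. Let us consider each type separately.

(a) Light damping is defined by the condition $\tfrac{1}{2}\gamma < \omega_0$. In this case we
write

    $\left(\tfrac{1}{4}\gamma^2 - \omega_0^2\right)^{1/2} = \mathbf{i}\left(\omega_0^2 - \tfrac{1}{4}\gamma^2\right)^{1/2} = \mathbf{i}\Omega,$    (8.27a)

for we know that the unit imaginary must be a bivector $\mathbf{i}$ specifying the plane
of motion. The solution, therefore, has the form

    $\mathbf{r} = e^{-\frac{1}{2}\gamma t}\left(\mathbf{a}_+e^{\mathbf{i}\Omega t} + \mathbf{a}_-e^{-\mathbf{i}\Omega t}\right)$    (8.28a)

    $\ \ = e^{-\frac{1}{2}\gamma t}\left(\mathbf{a}\cos\Omega t + \mathbf{b}\sin\Omega t\right).$    (8.28b)

This can be interpreted as an ellipse with decaying amplitude. The exponential
factor shows that in time $t = 2\gamma^{-1}$ the amplitude will be damped by the
significant factor $e^{-1}$. If $\tfrac{1}{2}\gamma \ll \omega_0 \approx \Omega$, then $2\gamma^{-1} \gg \Omega^{-1}$, so the amplitude of
the ellipse will be nearly constant during a single period, and many periods will
pass before its amplitude has been damped significantly.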

We have noted that, in general, the energy of an oscillator is a more
significant state variable than the amplitude. Using the solution (8.28b) in the
form $\mathbf{r} = e^{-\frac{1}{2}\gamma t}\mathbf{s}$, we find that the energy $E$ defined by (8.22) can be written in
the form

    $E = \tfrac{1}{2}m\left(\dot{\mathbf{s}}^2 - \gamma\,\mathbf{s}\cdot\dot{\mathbf{s}} + \tfrac{1}{4}\gamma^2\mathbf{s}^2 + \omega_0^2\mathbf{s}^2\right)e^{-\gamma t}.$

The factor in parentheses is bounded and oscillatory, with a constant value
when averaged over a period of the oscillator. Therefore, the average
decrease in energy with time is determined by the exponential factor $e^{-\gamma t} = e^{-t/\tau}$,
where

    $\tau = \frac{1}{\gamma}$    (8.29)

is referred to as the lifetime of the oscillator’s initial state of motion.

(b) Heavy damping is defined by the condition $\tfrac{1}{2}\gamma > \omega_0$. In this case,

    $\alpha = \left(\tfrac{1}{4}\gamma^2 - \omega_0^2\right)^{1/2}$    (8.30a)

is a positive scalar and the solution (8.26) assumes the form

    $\mathbf{r} = \mathbf{a}_-e^{-(\frac{1}{2}\gamma + \alpha)t} + \mathbf{a}_+e^{-(\frac{1}{2}\gamma - \alpha)t}.$    (8.30b)

The first term decays more rapidly than the second one, and the orbit does
not encircle the equilibrium point as it does in the case of light damping. 

(c) Critical damping is defined by the condition $\omega_0 = \tfrac{1}{2}\gamma$. For this case, the
characteristic equation corresponding to our trial solution has only one
distinct root, so we get the solution $\mathbf{r} = \mathbf{a}\,e^{-\frac{1}{2}\gamma t}$, which cannot be the most
general solution, because it contains only one of the two required constant
vectors. However, we can find the general solution by allowing the coefficient
to be a function of time. (This is called the method of variable coefficients.)
After substituting $\mathbf{r}(t) = \mathbf{a}(t)e^{-\frac{1}{2}\gamma t}$ into the equation of motion (8.24) with
$\omega_0 = \tfrac{1}{2}\gamma$, we find that our trial solution works if $\ddot{\mathbf{a}} = 0$, so $\mathbf{a} = \mathbf{a}_0 + \mathbf{b}t$, where
$\mathbf{a}_0$ and $\mathbf{b}$ are constants.

Thus we arrive at the general solution

    $\mathbf{r} = e^{-\frac{1}{2}\gamma t}\left(\mathbf{a}_0 + \mathbf{b}t\right).$    (8.31)


The condition for critical damping is unlikely to be met in naturally occurring 
systems, but it is built into certain detection devices such as the galvanometer. 
A detection device may consist of a damped oscillator which is displaced from 
its equilibrium position by an impulsive force (signal). One wants it to have 
sufficient damping to return to the equilibrium position without oscillation so 
it will be ready to respond to another signal as soon as possible. On the other 
hand, for the sake of sensitivity, one wants it to respond significantly to weak 
impulses, which requires that the damping be as light as possible. The 
maximal compromise between these two conflicting criteria is the condition 
for critical damping. 
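
The three damping regimes can be summarized in a few lines of code. The sketch below classifies an oscillator by the sign of gamma^2/4 - omega_0^2 and returns the two roots of the characteristic equation of (8.24). It is a scalar illustration only: the ordinary complex unit of Python's cmath module stands in for the unit bivector of the plane of motion, and the function names are assumptions introduced here.

```python
import cmath

def damping_regime(gamma, omega0):
    """Classify the oscillator by the sign of gamma^2/4 - omega0^2."""
    discriminant = 0.25*gamma**2 - omega0**2
    if discriminant < 0:
        return "light"
    if discriminant > 0:
        return "heavy"
    return "critical"

def characteristic_roots(gamma, omega0):
    """Roots of lambda^2 + gamma*lambda + omega0^2 = 0, as in (8.25)."""
    s = cmath.sqrt(0.25*gamma**2 - omega0**2)
    return (-0.5*gamma + s, -0.5*gamma - s)

for gamma, omega0 in ((1.0, 2.0), (5.0, 2.0), (4.0, 2.0)):
    print(gamma, omega0, damping_regime(gamma, omega0), characteristic_roots(gamma, omega0))
```

For light damping the roots form a complex-conjugate pair (a decaying oscillation), for heavy damping they are two distinct negative reals, and for critical damping they coincide.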


3-8. Exercises 


(8.1) For an oscillator with orbit 
r(t) = a cos(w,t + ¢,) + b sin(a,t + d), 


determine the major axis a and the minor axis b from the initial 
conditions r, = r(0), v, = r(0). 
Specifically, show that, for 0 < o, <4, 


$$\mathbf{a} = \mathbf{r}_0\cos\phi_0 - \frac{\mathbf{v}_0}{\omega_0}\sin\phi_0,$$

$$\mathbf{b} = \mathbf{r}_0\sin\phi_0 + \frac{\mathbf{v}_0}{\omega_0}\cos\phi_0,$$

while $\phi_0$ is determined by

$$\tan 2\phi_0 = \frac{2\omega_0\,\mathbf{r}_0\cdot\mathbf{v}_0}{\mathbf{v}_0^2 - \omega_0^2\mathbf{r}_0^2}.$$

(8.2) Ancient properties of the ellipse: On the ellipse
$\mathbf{r}(\phi) = \mathbf{a}\cos\phi + \mathbf{b}\sin\phi$, the point
$\mathbf{s}(\phi) = \mathbf{r}(\phi + \pi/2)$ is said to be conjugate to the
point $\mathbf{r}(\phi)$. Prove that the following holds for any pair of conjugate
points.

(a) The tangent to the ellipse at $\mathbf{r}$ is parallel to the conjugate radius
$\mathbf{s}$ (Figure 8.1).

(b) The first theorem of Apollonius:

$$\mathbf{r}^2 + \mathbf{s}^2 = \mathbf{a}^2 + \mathbf{b}^2.$$

(c) The second theorem of Apollonius:

$$\mathbf{r}\wedge\mathbf{s} = \mathbf{a}\wedge\mathbf{b},$$

that is, the parallelograms determined by conjugate radii all
have the same area.

Interpret these theorems of Apollonius as conservation laws for an
oscillator (see Section 3.10).

(8.3) Establish the equivalence of Equations (8.18) and (8.16a). Show the
planes of circular motion are determined from a given ellipse by the
equations

$$\mathbf{i}_\pm = \mathbf{a}^{-1}\mathbf{b} \pm i\epsilon\hat{\mathbf{b}},$$

where the eccentricity $\epsilon$ is determined by the equation $\epsilon^2 a^2 = a^2 - b^2$.
Show that the dihedral angle $\alpha$ between the planes is determined by

$$\cos\alpha = 1 - 2\epsilon^2.$$

(8.4) Show that the effect of a constant force on a harmonic oscillator is
equivalent to the displacement of the equilibrium point of the oscillator.
Use this to find a parametric equation for the orbit of an isotropic
oscillator in a constant gravitational field.

(8.5) Find a parametric equation for the orbit of a charged isotropic
oscillator in a uniform magnetic field, as characterized by the
equation

$$m\ddot{\mathbf{r}} = -m\omega_0^2\mathbf{r} + \frac{q}{c}\,\dot{\mathbf{r}}\times\mathbf{B}.$$

(Suggestion: Write $\mathbf{r} = \mathbf{x} + z\mathbf{B}$ where $\mathbf{x}\cdot\mathbf{B} = 0$ and separate
differential equations for $\mathbf{x}$ and $z$.)

(8.6) Show that the general solution to the equation

$$m\ddot{\mathbf{r}} + k\mathbf{r} = 0,$$

for both positive and negative values of $k$, can be put in the form

$$\mathbf{r}(t) = \mathbf{c}\cosh z(t),$$

where $z = \mu + \mathbf{i}\phi$ and $\mathbf{i}$ is a unit bivector. Determine the time
dependence of scalars $\mu$ and $\phi$ for the two cases.

Note that $\mathbf{r}(\mu, \phi) = \mathbf{c}\cosh(\mu + \mathbf{i}\phi)$ is a parametric equation for
an ellipse if $\mu$ is held constant or a hyperbola if $\phi$ is held constant.
Express the major axis $\mathbf{a}$ and the minor axis $\mathbf{b}$ of the ellipse in terms
of $\mathbf{c}$ and $\mu$. For given $\mathbf{a}$ and $\phi$ determine the directance from the
origin to each asymptote of the hyperbola.

Every point $\mathbf{r} = \mathbf{r}(\mu, \phi)$ in the $\mathbf{i}$-plane is designated by unique
values of $\mu$ and $\phi$ in the ranges $-\infty < \mu < \infty$, $0 \le \phi < 2\pi$. The
parameters $\mu$, $\phi$ are called elliptical coordinates for the plane.

(8.7) Evaluate constants $\mathbf{a}_+$ and $\mathbf{a}_-$ in Equation (8.30b) in terms of initial
conditions $\mathbf{r}_0$, $\mathbf{v}_0$, and sketch a representative trajectory.


3-9. Forced Oscillations 


In this section we study the response of a bound particle to a periodic force. 
We concentrate on the very important case of a bound charge driven by an 
electromagnetic plane wave. But our results are quite characteristic of driven 
oscillatory systems in general. The properties of electromagnetic waves which 
we use in this section will be established in NFII. 

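For a charged isotropic oscillator in an "external" electromagnetic field, we
have the equation of motion

$$m\ddot{\mathbf{r}} + m\omega_0^2\mathbf{r} = q(\mathbf{E} + c^{-1}\dot{\mathbf{r}}\times\mathbf{B}). \tag{9.1}$$

If the external field is an electromagnetic plane wave, then $\mathbf{E}^2 = \mathbf{B}^2$, so

$$|c^{-1}\dot{\mathbf{r}}\times\mathbf{B}| \le \frac{|\dot{\mathbf{r}}|}{c}\,|\mathbf{E}|.$$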


Hence, for velocities small compared to the speed of light, the magnetic force 
is negligible compared to the electric force. At any point r in space, the 
electric field E of a circularly polarized plane wave is a rotating vector of the 
form 


$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0\,e^{\mathbf{i}(\omega t - \mathbf{k}\cdot\mathbf{r})}. \tag{9.2}$$

Here, $\mathbf{E}_0$ is a constant vector and $|\mathbf{E}_0|$ is the amplitude of the wave; the
(circular) frequency of the wave is $\omega = c|\mathbf{k}|$; the wavelength is
$\lambda = 2\pi/|\mathbf{k}|$; the plane of the rotating vector $\mathbf{E}$ is perpendicular to the
direction $\hat{\mathbf{k}}$ of the propagating wave, and it is specified by the unit bivector
$\mathbf{i} = i\hat{\mathbf{k}}$. For $\omega > 0$, Equation (9.2) describes a left circularly polarized
plane wave (for $\omega < 0$, it describes a right circularly polarized wave). We are
most interested in applying (9.2) to a region of atomic or molecular dimensions;
such a region is small compared to the wavelength of visible light, in which case
$|\mathbf{k}\cdot\mathbf{r}| < 2\pi r/\lambda \approx 0$. Furthermore, we shall see that the motion of a driven
oscillator tends to lie in the plane $\mathbf{k}\cdot\mathbf{r} = 0$. For these reasons, it is an
acceptable approximation, besides being a considerable mathematical simplification,
to neglect any effect of the factor $\mathbf{k}\cdot\mathbf{r}$ in (9.2) on oscillator motion. Thus, our
equation of motion (9.1) assumes the specific form


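$$\ddot{\mathbf{r}} + \omega_0^2\mathbf{r} = \boldsymbol{\epsilon}\,e^{\mathbf{i}\omega t}, \tag{9.3}$$

where $\boldsymbol{\epsilon} = (q/m)\mathbf{E}_0$.

To solve (9.3), we try a solution of the form $\mathbf{r} = \mathbf{A}e^{\lambda t}$ and, after carrying out
the differentiation, find that

$$\mathbf{A}e^{\lambda t}(\lambda^2 + \omega_0^2) = \boldsymbol{\epsilon}\,e^{\mathbf{i}\omega t}.$$

This equation will hold for all values of $t$ if and only if $\lambda = \mathbf{i}\omega$. The equation
also determines $\mathbf{A}$ uniquely, whence

$$\mathbf{r} = \frac{\boldsymbol{\epsilon}\,e^{\mathbf{i}\omega t}}{\omega_0^2 - \omega^2}$$

is a particular solution of (9.3). We get the general solution with two arbitrary
constants by adding to the particular solution the solution of the homogeneous
equation $\ddot{\mathbf{r}} + \omega_0^2\mathbf{r} = 0$; thus,

$$\mathbf{r} = \mathbf{a}\cos\omega_0 t + \mathbf{b}\sin\omega_0 t + \frac{\boldsymbol{\epsilon}\,e^{\mathbf{i}\omega t}}{\omega_0^2 - \omega^2}.$$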
The last term in this equation describes the displacement of the oscillator due 
to the driving force exerted by the electromagnetic wave. It is a rotating 
vector in the $\mathbf{E}\wedge\mathbf{B}$ plane. Its amplitude is infinite when the driving frequency
$\omega$ of the wave matches the natural frequency $\omega_0$ of the oscillator, a condition
called resonance. As usual, an "unphysical" infinity such as this points to a
deficiency in our model of the interacting systems. At resonance it becomes 
essential to take into account the omnipresent resistive forces which other- 
wise might be negligible. 


Forced Oscillator with Linear Resistance 


We can improve our model of an electromagnetic wave interacting with a 
bound charge by adding a linear resistive force. Thus, we consider the 
equation of motion 

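$$\ddot{\mathbf{r}} + \gamma\dot{\mathbf{r}} + \omega_0^2\mathbf{r} = \boldsymbol{\epsilon}\,e^{\mathbf{i}\omega t}. \tag{9.4}$$

This equation can be solved in the same way as (9.3). The solution can be
written as a sum $\mathbf{r} = \mathbf{r}_1 + \mathbf{r}_2$ of "transient" and "forced" displacements. The
transient displacement $\mathbf{r}_1$ is a solution of the homogeneous equation

$$\ddot{\mathbf{r}}_1 + \gamma\dot{\mathbf{r}}_1 + \omega_0^2\mathbf{r}_1 = 0,$$

determined by the initial conditions of the oscillator. We have seen that the
amplitude of this solution decreases with a relaxation time $2\gamma^{-1}$, so eventually
it will be negligible compared to the steady state displacement supported by
the continually applied driving force.

The forced displacement $\mathbf{r}_2$ is the particular solution of (9.4) determined
entirely by the driving force. Ignoring the transient solution, we write $\mathbf{r} = \mathbf{r}_2$
and then insert the trial solution $\mathbf{r} = \mathbf{a}e^{\mathbf{i}\omega t}$ into (9.4), with the result

$$\mathbf{r}(-\omega^2 + \gamma\mathbf{i}\omega + \omega_0^2) = \boldsymbol{\epsilon}\,e^{\mathbf{i}\omega t}.$$

Solving for $\mathbf{r}$, we put the solution in the form

$$\mathbf{r}(t) = \mathbf{A}e^{\mathbf{i}(\omega t - \delta)}, \tag{9.5a}$$

with vector amplitude

$$\mathbf{A} = \mathbf{A}(\omega) = \frac{\boldsymbol{\epsilon}}{[(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2]^{1/2}} \tag{9.5b}$$

and phase angle

$$\delta = \delta(\omega) = \tan^{-1}\frac{\gamma\omega}{\omega_0^2 - \omega^2}, \tag{9.5c}$$

where $0 < \delta < \pi$ accounts for the full range of the parameter $\omega$.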
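
The amplitude and phase formulas (9.5b) and (9.5c) are easy to evaluate numerically. The following is a minimal sketch, not part of the original text; the function name `steady_state_response` and the sample values of `omega0` and `gamma` are assumptions chosen for illustration. It uses `atan2` so that the phase lag stays in the range from 0 to pi as the driving frequency passes through resonance.

```python
import math

def steady_state_response(omega, omega0, gamma, eps=1.0):
    """Return (amplitude, phase lag) of the forced displacement, Eqs. (9.5b, c).

    eps is the magnitude of the driving vector epsilon (hypothetical default 1).
    """
    detuning = omega0**2 - omega**2
    amplitude = eps / math.hypot(detuning, gamma * omega)
    # atan2 selects the branch 0 <= delta <= pi because gamma * omega >= 0.
    delta = math.atan2(gamma * omega, detuning)
    return amplitude, delta

omega0, gamma = 1.0, 0.1  # assumed illustrative values
for omega in (0.01, 1.0, 100.0):  # low frequency, resonance, high frequency
    amplitude, delta = steady_state_response(omega, omega0, gamma)
    print(f"omega={omega:7.2f}  A={amplitude:.6g}  delta={delta:.4f} rad")
```

The three sample frequencies correspond to the three limiting cases examined in the text: nearly in phase at low frequency, a quarter turn behind at resonance, and nearly half a turn behind at high frequency.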

The solution (9.5) shows that $\mathbf{r}$ is a rotating vector lagging behind the
driving force $q\mathbf{E}$ by the phase angle $\delta = \delta(\omega)$ (Figure 9.1). Let us examine the
limiting cases in the range of driving frequencies.

Fig. 9.1. Phase angle between $\mathbf{E}$ and the position response $\mathbf{r}$ in the $\mathbf{i}$-plane.

(a) Low frequencies $\omega \ll \omega_0$. (A scale distinguishing large from small is
determined by the ratio $\omega_0/\gamma$.) Equation (9.5c) gives $\tan\delta \approx \delta \approx 0$, and the
solution reduces to

$$\mathbf{r}(t) = \frac{\boldsymbol{\epsilon}}{\omega_0^2}\,e^{\mathbf{i}\omega t}.$$

This shows that resistance is not important in slowly driven motion. A gradually
varying force gives the oscillator time to respond and follow exactly in phase
($\delta = 0$).

(b) High frequencies $\omega \gg \omega_0$. Then $\tan\delta \approx -\gamma/\omega \approx 0$, $\delta \approx \pi$ and

$$\mathbf{r}(t) = -\frac{\boldsymbol{\epsilon}}{\omega^2}\,e^{\mathbf{i}\omega t}.$$

Thus, the response to a rapidly varying force lags behind the force by the
maximum phase angle $\delta = \pi$, and it is weaker in amplitude than the response
to a slowly varying force.

(c) Resonance $\omega = \omega_0$. Then $\tan\delta = \infty$, $\delta = \pi/2$ and

$$\mathbf{r}(t) = -\frac{\boldsymbol{\epsilon}\,\mathbf{i}\,e^{\mathbf{i}\omega_0 t}}{\gamma\omega_0}.$$

Thus, the response at resonance is orthogonal to the driving force, and it is
stronger than the low frequency response if and only if $\gamma < \omega_0$.


The squared amplitude $A^2 = |\mathbf{A}(\omega)|^2$ and the phase angle $\delta(\omega)$ are graphed
in Figure 9.2. As the graph shows, maximum amplitude is not attained at the
resonant frequency $\omega_0$, but rather at a resonant amplitude frequency $\omega_R$
defined by the condition

$$\left.\frac{dA^2}{d\omega}\right|_{\omega = \omega_R} = 0.$$

Carrying out the differentiation on (9.5b), we find that

$$\omega_R = (\omega_0^2 - \tfrac{1}{2}\gamma^2)^{1/2}. \tag{9.6}$$

Fig. 9.2a. Squared amplitude near resonance.

To get a measure of the width of the resonance, we locate the points $\omega_\pm$ at
which $A^2$ has half its maximum value. After some algebra, we find that

$$\omega_\pm^2 = \omega_R^2 \pm \gamma(\omega_R^2 + \tfrac{1}{4}\gamma^2)^{1/2}. \tag{9.7}$$

The resonance width $\Delta\omega$ is defined by

$$\Delta\omega = |\omega_+ - \omega_-|. \tag{9.8}$$

For light damping ($\gamma \ll \omega_0$), Equation (9.6) gives $\omega_R \approx \omega_0$, and (9.7) yields

$$\omega_\pm - \omega_R \approx \frac{\pm\gamma\omega_R}{\omega_\pm + \omega_R} \approx \pm\tfrac{1}{2}\gamma.$$

Therefore, the resonance width is given by the simple expression

$$\Delta\omega = \gamma = \frac{1}{\tau}, \tag{9.9}$$

where, as was established in the last section, $\tau$ is the lifetime of the oscillator
state if the driving force is suddenly removed.
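
The algebra behind (9.6) through (9.9) can be checked numerically. The sketch below is an illustration under stated assumptions rather than the author's calculation: the helper names and the values `omega0 = 1.0`, `gamma = 0.02` are hypothetical, and (9.7) is read as the pair of frequencies whose squares lie symmetrically about the square of the amplitude-resonance frequency. It evaluates the squared amplitude of (9.5b) at those frequencies and compares the resulting width with the damping constant.

```python
import math

def squared_amplitude(omega, omega0, gamma, eps=1.0):
    """Squared amplitude A^2 from Eq. (9.5b)."""
    return eps**2 / ((omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2)

def resonance_summary(omega0, gamma):
    """Return (omega_R, omega_minus, omega_plus) from Eqs. (9.6) and (9.7)."""
    omega_r_sq = omega0**2 - 0.5 * gamma**2                   # Eq. (9.6), squared
    shift = gamma * math.sqrt(omega_r_sq + 0.25 * gamma**2)   # Eq. (9.7)
    return (math.sqrt(omega_r_sq),
            math.sqrt(omega_r_sq - shift),
            math.sqrt(omega_r_sq + shift))

omega0, gamma = 1.0, 0.02  # assumed light damping
omega_r, omega_minus, omega_plus = resonance_summary(omega0, gamma)
peak = squared_amplitude(omega_r, omega0, gamma)
half_ratio = squared_amplitude(omega_plus, omega0, gamma) / peak
width = omega_plus - omega_minus
print(f"A^2 ratio at omega_plus: {half_ratio:.6f}")
print(f"width = {width:.6f}, gamma = {gamma:.6f}")
```

For light damping the printed width should agree with the damping constant to several digits, which is the content of (9.9).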

The inverse relation (9.9) between resonance width and lifetime is a very 
general and important property of unstable bound states of motion. Equation 
(9.5b) shows that the narrower the width, the higher the resonance peak 
(which would be infinite if $\gamma = 0$). Thus long-lived bound states are
characterized by narrow resonances with relatively high peaks.


Energy Storage and Dissipation


The potential energy of the oscillator has the value
$\tfrac{1}{2}m\omega_0^2\mathbf{r}^2 = \tfrac{1}{2}m\omega_0^2A^2$, so
the graph of $A^2$ in Figure 9.2 is equivalent to a graph of potential energy. The
graph, therefore, describes the relative amount of energy stored in the
oscillator for the various driving frequencies, and $\omega_R$ is the frequency for
which the stored energy is a maximum. However, as we have noted before, the
stored energy of an atomic system is not directly observable. Rather, it is the
energy absorption as a function of frequency that can be directly measured. We
must determine this function before our results can be fully interpreted.

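Fig. 9.2b. Phase angle near resonance.

An isotropic oscillator subject to an arbitrary external force $\mathbf{f}$ obeys the
equation of motion

$$m\ddot{\mathbf{r}} + m\omega_0^2\mathbf{r} = \mathbf{f}.$$

Multiplying this by $\dot{\mathbf{r}}$, we get the equation

$$\frac{d}{dt}\left(\tfrac{1}{2}m\dot{\mathbf{r}}^2 + \tfrac{1}{2}m\omega_0^2\mathbf{r}^2\right) = \mathbf{f}\cdot\dot{\mathbf{r}} \tag{9.10}$$

for the rate of energy change (power) induced by external forces. For the case
we have been considering,

$$\mathbf{f} = -m\gamma\dot{\mathbf{r}} + q\mathbf{E},$$

so

$$\frac{d}{dt}\left(\tfrac{1}{2}m\dot{\mathbf{r}}^2 + \tfrac{1}{2}m\omega_0^2\mathbf{r}^2\right) = -m\gamma\dot{\mathbf{r}}^2 + q\mathbf{E}\cdot\dot{\mathbf{r}}. \tag{9.11}$$

The first term on the right describes the power loss due to dissipative forces. The
second term describes the power delivered by the electromagnetic field. For a
steady state of motion the energy is constant so the power lost must equal
power supplied, and from (9.11) we conclude that the power supply $P$ is
determined by

$$P = m\gamma\dot{\mathbf{r}}^2 = 2\gamma(\tfrac{1}{2}m\dot{\mathbf{r}}^2). \tag{9.12}$$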


Obviously, this relation applies to any driving force producing a steady state 
of motion. 
Our analysis shows that a damped oscillator in a steady state of motion
dissipates energy continuously, but the motion persists because an equal 
amount of energy is continuously supplied by the driving force. According to 
(9.12), the rate at which energy is absorbed and dissipated is proportional to 
the kinetic energy of the oscillator. For the motion described in (9.5), the 
power absorbed as a function of driving frequency is specifically given by 


$$P(\omega) = \frac{m\gamma\epsilon^2\omega^2}{(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2}. \tag{9.13}$$

This function has a maximum at the resonance frequency $\omega_0$, as the reader
may verify. Thus, the maximum of the kinetic energy is at the frequency $\omega_0$
for which the energy dissipated is a maximum, whereas, as we have seen, the
maximum of the potential energy is at the frequency $\omega_R$ for which the energy
stored is a maximum. The general shape of the graph for $P(\omega)$ is similar to
that for the potential energy, so it would be repetitious to discuss it. 
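
The claim that absorbed power peaks at the natural frequency while stored potential energy peaks slightly below it can be checked with a coarse scan. This is a hedged sketch, assuming the steady-state relation that power equals the damping constant times mass times squared speed, with speed equal to driving frequency times amplitude; the helper names, the grid, and the deliberately heavy damping `gamma = 0.4` are illustrative choices, not values from the text.

```python
def squared_amplitude(omega, omega0, gamma, eps=1.0):
    """A^2 from Eq. (9.5b); proportional to the stored potential energy."""
    return eps**2 / ((omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2)

def absorbed_power(omega, omega0, gamma, eps=1.0, m=1.0):
    """P(omega) from Eq. (9.13), written as m * gamma * omega^2 * A^2."""
    return m * gamma * omega**2 * squared_amplitude(omega, omega0, gamma, eps)

def argmax_on_grid(func, lo, hi, steps):
    """Return the grid point in [lo, hi] at which func is largest."""
    grid = [lo + (hi - lo) * n / steps for n in range(steps + 1)]
    return max(grid, key=func)

omega0, gamma = 1.0, 0.4  # heavy damping so the two peaks are clearly separated
peak_power = argmax_on_grid(lambda w: absorbed_power(w, omega0, gamma), 0.5, 1.5, 10000)
peak_amplitude = argmax_on_grid(lambda w: squared_amplitude(w, omega0, gamma), 0.5, 1.5, 10000)
print(f"power peaks near omega = {peak_power:.4f}")
print(f"amplitude peaks near omega = {peak_amplitude:.4f}")
```

With these values the amplitude peak should fall near the square root of `omega0**2 - gamma**2 / 2`, in agreement with (9.6).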

Resonance occurs in every oscillatory system, so it is a phenomenon of 
great practical importance. If the damping is weak, a small periodic force can
set up large oscillations. Consequently, it is undesirable to build a boat with a 
natural pitching frequency which might be close to a likely frequency of 
waves. The same basic principle was overlooked in the design of the Tacoma 
Narrows Bridge, which was destroyed when it resonated with periodic wind 
gusts. Undesirable resonances in machinery of all kinds can be avoided by 
damping devices such as shock absorbers in cars. 


The Faraday Effect 


Undoubtedly the most common and important example of resonance occurs 
in the interaction of light with matter. The mathematical theory of optical 
properties of matter was pioneered by H. A. Lorentz. He supposed that the
atoms in a given material can be regarded as charged harmonic oscillators 
when considering their interactions with light. He was then able to derive 
mathematical expressions for the index of refraction of gases, dielectrics, and 
metals as well as explain a number of other optical properties. Our model of a 
damped oscillator is used in his theory of dielectrics. From the model it is 
clear that incident light with frequencies close to the natural frequencies of 
the atoms will be strongly absorbed, while a dielectric will be transparent to 
light with frequencies outside the range of its natural frequencies. A modern 
introduction to the Lorentz electron theory is given by Feynman (1963, Vol.
II, Chap. 32).

As an application of the Lorentz electron theory, let us see how it explains 
the Faraday effect. In 1845 Faraday showed that the polarization plane of light 
passing through a glass rod will be rotated by an external magnetic field 
directed along the line of propagation. This was the first experimental 
demonstration of a connection between magnetism and light. Today the
Faraday effect has important technological applications. And in astronomy, it 
explains the polarization of light passing through the strong magnetic fields of 
pulsars. 

To understand the Faraday effect, we consider the effect of a constant 
magnetic field B on an oscillator driven by a linearly polarized plane wave. 
The equation of motion is 

$$m\ddot{\mathbf{r}} + m\omega_0^2\mathbf{r} = \frac{q}{c}\,\dot{\mathbf{r}}\times\mathbf{B} + q\mathbf{E}_0\cos\omega t, \tag{9.14}$$

where $\mathbf{B}\cdot\mathbf{E}_0 = 0$ for a plane wave propagating along the line of the magnetic
field. We know that it is unnecessary to include damping terms in order to
locate resonances, and we know that steady state solutions will be orthogonal
to the direction of wave propagation. Consequently, we can assume that
$\mathbf{r}\cdot\mathbf{B} = 0$ and rewrite (9.14) in the form


$$\ddot{\mathbf{r}} - 2\omega_L\,\mathbf{i}\,\dot{\mathbf{r}} + \omega_0^2\mathbf{r} = \boldsymbol{\epsilon}\cos\omega t, \tag{9.15}$$

where $\boldsymbol{\epsilon}$ is defined as before, the bivector $\mathbf{i}$ is defined by $\mathbf{i} = i\hat{\mathbf{B}}$, and the
so-called Larmor frequency $\omega_L$ is defined by

$$\omega_L = \frac{q|\mathbf{B}|}{2mc}. \tag{9.16}$$

Note that $\omega_L = -|\omega_L|$ for an electron, since its charge $q$ is negative. The
linearly polarized plane wave can be expressed as a superposition of left and
right circularly polarized waves:

$$\boldsymbol{\epsilon}\cos\omega t = \tfrac{1}{2}\boldsymbol{\epsilon}(e^{\mathbf{i}\omega t} + e^{-\mathbf{i}\omega t}), \tag{9.17}$$

where $\omega$ is positive. Consequently, the steady state solution must have the
form

$$\mathbf{r} = \mathbf{a}_+e^{\mathbf{i}\omega t} + \mathbf{a}_-e^{-\mathbf{i}\omega t}. \tag{9.18}$$

To determine the amplitudes $\mathbf{a}_\pm$, we substitute (9.18) into (9.15) and equate
coefficients with the same time dependence, with the result

$$\mathbf{a}_\pm = \frac{\tfrac{1}{2}\boldsymbol{\epsilon}}{\omega_0^2 - \omega^2 \mp 2\omega\omega_L}. \tag{9.19}$$

This shows that there will be resonance when

$$\omega = (\omega_0^2 + \omega_L^2)^{1/2} \mp \omega_L. \tag{9.20}$$

For atomic systems in feasible laboratory fields it can be shown that
$\omega_0 \gg |\omega_L|$. Consequently, for propagation in the direction of $\mathbf{B}$, the left
circularly polarized component of the wave has a resonance at
$\omega = \omega_0 + |\omega_L|$, while the right circularly polarized component has a
resonance at $\omega = \omega_0 - |\omega_L|$. The locations of the resonances are
interchanged if $\mathbf{B}$ is opposite to the direction of propagation. In summary, the
effect of a magnetic field is to shift the resonance frequencies of circularly
polarized waves by opposite amounts $\pm|\omega_L|$.

According to the Lorentz theory, the index of refraction for a material 
depends on the locations of its resonances. For more details see Feynman 
(1963). In the present case, the left and right circularly polarized waves 
resonate at different frequencies, so they have different indices of refraction 
and different phase velocities in the medium. The net effect is a rotation of 
the polarization plane of the propagating wave. 
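
The size of the resonance splitting in (9.20) can be illustrated with numbers. The sketch below is only an illustration: the function name is hypothetical, the two frequencies are assumed order-of-magnitude values (an optical resonance and the Larmor frequency of an electron in a field of about one tesla), and the amplitudes of (9.18) are taken to have denominators of the form natural frequency squared, minus driving frequency squared, minus or plus twice the product of driving and Larmor frequencies.

```python
import math

def faraday_resonances(omega0, omega_L):
    """Positive driving frequencies at which the two circular amplitudes of (9.18) diverge."""
    root = math.hypot(omega0, omega_L)  # (omega0^2 + omega_L^2)^(1/2)
    left = root - omega_L               # upper sign in Eq. (9.20)
    right = root + omega_L              # lower sign in Eq. (9.20)
    return left, right

omega0 = 3.0e15    # assumed optical resonance frequency, rad/s
omega_L = -8.8e10  # assumed electron Larmor frequency, rad/s (negative because q < 0)
left, right = faraday_resonances(omega0, omega_L)
print(f"left-circular resonance shift:  {left - omega0:+.4e} rad/s")
print(f"right-circular resonance shift: {right - omega0:+.4e} rad/s")
```

Because the Larmor frequency is so much smaller than the natural frequency, the two shifts come out equal and opposite to high accuracy, which is the approximation used in the text.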


3-9. Exercises 


(9.1) Derive Equations (9.6) and (9.7). Show that the maximum kinetic
energy and power dissipation are attained at a driving frequency
equal to the natural frequency of an oscillator.

(9.2) Solve the damped isotropic oscillator with a sinusoidal driving force
$\boldsymbol{\epsilon}\sin\omega t$. Determine the phase angle and the average potential
energy (where the average is taken over a period of the driving
force). Compare with the solution (9.5a, b, c).

(9.3) Determine the velocity $\mathbf{v} = \mathbf{v}(t)$ of a charged particle in a constant
magnetic field $\mathbf{B}$ and a plane electromagnetic wave with circular
frequency $\omega$. Use your result to explain the fact that electromagnetic
waves with linear frequencies in the neighborhood of the
"cyclotron frequency" $\omega_c/2\pi = 1400$ kHz are sharply attenuated
when passing through the ionosphere. (The charge to mass ratio of
an electron is $|q/m| = 1.76\times 10^{11}$ Coul kg$^{-1}$. The strength of the
Earth's magnetic field in the ionosphere is given by $B/c = 5\times 10^{-5}$
webers m$^{-2}$, where $c$ is the speed of light.)


3-10. Conservative Forces and Constraints 


So far we have studied only specific force laws of the simplest mathematical 
form. To survey the broad range of force laws with physical significance, we 
must proceed systematically, classifying forces according to general
principles. The general approach which has proved to be most powerful is to
distinguish forces by identifying conservation laws or constants of motion 
which they admit or disallow. In this section we examine conditions under 
which energy conservation holds and some of its implications for single 
particle motion. 

We have already analyzed energy conservation and dissipation for an 
isotropic oscillator. For a more general analysis of energy conservation, we 
multiply the equation of motion $m\dot{\mathbf{v}} = \mathbf{f}$ by $\mathbf{v} = \dot{\mathbf{x}}$ to get

$$\frac{d}{dt}\left(\tfrac{1}{2}m\mathbf{v}^2\right) = \mathbf{v}\cdot\mathbf{f}. \tag{10.1}$$

This is an equation for the change of the so-called kinetic energy $\tfrac{1}{2}m\mathbf{v}^2$ due to
the action of the force $\mathbf{f}$. If $\mathbf{v}\cdot\mathbf{f} = 0$, the kinetic energy is a constant of motion.
The magnetic force $(q/c)\,\mathbf{v}\times\mathbf{B}$ has this property, irrespective of how the
magnetic field B = B(x, t) depends on position and time. Since it never alters 
the energy of a particle, the magnetic force is said to be a conservative force. 

A more general concept of conservative force can be developed by con- 
sidering a force f with the property 


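$$\mathbf{v}\cdot\mathbf{f} = -\dot{\mathbf{x}}\cdot\nabla V, \tag{10.2}$$

where $V = V(\mathbf{x}, t)$. According to the identity (2-8.16),

$$\dot{\mathbf{x}}\cdot\nabla V = \frac{dV}{dt} - \partial_t V,$$

so substitution of (10.2) into (10.1) yields

$$\frac{d}{dt}\left(\tfrac{1}{2}m\mathbf{v}^2 + V\right) = \partial_t V. \tag{10.3}$$

Hence, the quantity

$$E = \tfrac{1}{2}m\mathbf{v}^2 + V \tag{10.4}$$

is conserved if and only if $\partial_t V = 0$. We refer to the quantity $V$ as the potential
energy and to $E$ as the (total) energy of the particle.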

The force f is said to be conservative if the associated energy given by (10.4) 
is conserved. There is no commonly accepted term to refer to the more
general case when $\partial_t V \ne 0$, for the good reason that (10.3) then gives little
useful information. However, it should be noted that the explicit time
dependence of the potential $V(\mathbf{x}, t)$ arises from the motion of "its source,"
namely, the particles "producing" the force $\mathbf{f} = -\nabla V(\mathbf{x}, t)$. We shall see later
on that the explicit time dependence often disappears when the potential is
expressed as a function of relative positions of interacting particles so the
conservative case is more general than it might appear at first. Now, from
(10.2) we can conclude that a conservative force $\mathbf{f}$ has the general form

$$\mathbf{f} = -\nabla V(\mathbf{x}) + \mathbf{N} \quad\text{where}\quad \mathbf{N}\cdot\mathbf{v} = 0, \tag{10.5}$$

but $\mathbf{N} = \mathbf{N}(\mathbf{v}, \mathbf{x}, t)$ can otherwise have any functional dependence on $\mathbf{v}$, $\mathbf{x}$ and
$t$. Let us refer to $\mathbf{N}$ as the normal component of the conservative force,
because the condition $\mathbf{v}\cdot\mathbf{N} = 0$ implies that it is always normal (or
perpendicular) to the particle path. The normal force changes the direction of
particle motion without affecting the speed (or kinetic energy). We have seen 
that the magnetic force has this property. So do forces of constraint, as we 
shall see below. 
It is customary to define a conservative force as one which can be put in the
form $\mathbf{f} = -\nabla V(\mathbf{x})$. The more general definition (10.5) has been adopted here
to emphasize the most general conditions under which energy conservation
obtains. Of course, by the superposition principle both $-\nabla V$ and $\mathbf{N}$ can be
regarded as distinct forces, and in specific applications they have independent 
sources, so it is perfectly reasonable to consider them separately. 


Work 


It is instructive to put the energy conservation law in integral form, as distinct 
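from its differential form (10.3). Integrating from an initial state $\mathbf{x}_0 = \mathbf{x}(0)$,
$\mathbf{v}_0 = \mathbf{v}(0)$ to a final state $\mathbf{x} = \mathbf{x}(t)$, $\mathbf{v} = \mathbf{v}(t)$, we put (10.1) in the form

$$\tfrac{1}{2}m\mathbf{v}^2 - \tfrac{1}{2}m\mathbf{v}_0^2 = \int_0^t dt\,\mathbf{v}\cdot\mathbf{f} = \int_{\mathbf{x}_0}^{\mathbf{x}} d\mathbf{x}\cdot\mathbf{f}. \tag{10.6}$$
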
The integral here is referred to as the work done by force f on the particle in 
the time interval t. Work can be regarded as a transfer of energy from one 
physical system to another — in the present case, from the system producing 
the force f to the particle. The work is positive if the particle gains energy and 
negative if the particle loses energy. 
Now, substituting (10.2) into (10.6) and using (2-8.34), we find that, for a
conservative force,

$$\tfrac{1}{2}m\mathbf{v}^2 - \tfrac{1}{2}m\mathbf{v}_0^2 = -\int_{\mathbf{x}_0}^{\mathbf{x}} d\mathbf{x}\cdot\nabla V = V(\mathbf{x}_0) - V(\mathbf{x}). \tag{10.7a}$$

Thus, though kinetic and potential energies may differ, the total energy

$$\tfrac{1}{2}m\mathbf{v}^2 + V(\mathbf{x}) = \tfrac{1}{2}m\mathbf{v}_0^2 + V(\mathbf{x}_0) \tag{10.7b}$$

has the same value for the initial and final states, as well as for all
intermediate states. From (10.7a) it is evident that energy conservation depends
only on the potential difference $V(\mathbf{x}_0) - V(\mathbf{x})$ and not on the absolute value of
the potential function $V(\mathbf{x})$. Therefore, we are free to assign any convenient
value to the potential at one point, say $\mathbf{x}_0$, and the value at any other point
$\mathbf{x}$ will then be determined by an integral as in Equation (10.7a).

We have seen in Section 2.8 that an integral like the one in (10.7a) is 
independent of the path between initial and final states. We can conclude, 
therefore, that the work done by a conservative force is path-independent. 
The notion of path independence involves the concept of a force field, which 
is a more general concept of force than we started out with, so a few words 
about force fields are in order. 
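
Path independence of the work done by a conservative force can be seen in a small numerical experiment. This is a minimal sketch under stated assumptions, not part of the original text: the potential is taken to be that of a two-dimensional isotropic oscillator with an arbitrary stiffness `k = 2.0`, the two paths are hypothetical, and the line integral is approximated by a midpoint rule. Both paths join the same end points, so the two values of the work should agree with the potential difference in (10.7a).

```python
import math

def potential(x, y, k=2.0):
    """V(x) = (1/2) k |x|^2 for an isotropic oscillator (assumed example)."""
    return 0.5 * k * (x * x + y * y)

def force(x, y, k=2.0):
    """Conservative force f = -grad V for the potential above."""
    return (-k * x, -k * y)

def work_along(path, steps=20000):
    """Midpoint-rule approximation of the line integral of f . dx along path(s), 0 <= s <= 1."""
    total = 0.0
    x_prev, y_prev = path(0.0)
    for n in range(1, steps + 1):
        x_next, y_next = path(n / steps)
        fx, fy = force(0.5 * (x_prev + x_next), 0.5 * (y_prev + y_next))
        total += fx * (x_next - x_prev) + fy * (y_next - y_prev)
        x_prev, y_prev = x_next, y_next
    return total

start, end = (1.0, 0.0), (0.0, 2.0)

def straight(s):
    """Straight segment from start to end."""
    return (1.0 - s, 2.0 * s)

def curved(s):
    """Outward spiral arc that also runs from start to end."""
    radius, angle = 1.0 + s, 0.5 * math.pi * s
    return (radius * math.cos(angle), radius * math.sin(angle))

print(work_along(straight), work_along(curved), potential(*start) - potential(*end))
```

The work is negative here because the particle ends farther from the origin than it started, so the oscillator force removes kinetic energy along either path.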
Conservative Fields of Force


The concept of a force field arises naturally from an examination of the 
possible mathematical forms for the force function $\mathbf{f} = \mathbf{f}(\mathbf{v}, \mathbf{x}, t)$ in the
equation of motion $\mathbf{f} = m\dot{\mathbf{v}}$. A velocity independent force function $\mathbf{f}(\mathbf{x}, t)$ is a
time dependent vector field, and it can be characterized mathematically
without reference to any particle on which the force acts. We imagine, then,
that at each point $\mathbf{x}$ there is a time varying vector $\mathbf{f}(\mathbf{x}, t)$, which is the force that
would be exerted on a particle if there were one at that point. This conception 
of a force field, obtained by separating the concept of force from the concept 
of particle, has proved to be one of the most profound and fruitful ideas in 
physics. Later on we shall discuss implications of attributing an independent 
physical existence to force fields. For the time being, however, the concept of 
force field can be regarded merely as a convenient mathematical abstraction. 
It should be evident, now, that a conservative force $\mathbf{f}(\mathbf{x}) = -\nabla V(\mathbf{x})$ is
actually a conservative force field, because its properties, such as path- 
independence, relate values of the function at more than one point. 

The path-independence of the energy conservation law is a major reason 
for its importance. Thus, from (10.7a) we can deduce the change in speed of a 
particle passing from $\mathbf{x}_0$ to $\mathbf{x}$ without bothering to solve the equations of
motion to determine its path. On the other hand, since kinetic energy is
necessarily positive, the energy conservation law in the form (10.4) implies
$E - V(\mathbf{x}) \ge 0$, from which we can conclude that a particle with energy $E$ will
be confined to a region bounded by the surface $E - V(\mathbf{x}) = 0$ whatever its
trajectory. This shows that the path-independence is limited when the energy 
is assigned a specific value. We will make good use of this fact in the next 
chapter. 

Though the energy is path-independent, the conservative force $-\nabla V$ allows
only a limited selection of paths connecting given points. However, any path 
consistent with energy conservation can be achieved in principle simply by 
specifying an appropriate normal force N. Let us see how the problem of 
finding such a force can be formulated mathematically. 


Surface Constraints 


Suppose the particle is constrained to move on a surface determined by the 
scalar equation 


$$\phi(\mathbf{x}, t) = 0. \tag{10.8}$$


In a physical application, the equation of constraint (10.8) might describe the 
surface of a solid body. The explicit time dependence of the equation then 
allows for the possibility that the solid body may be moving. The body will 
exert a force on a particle in contact with its surface. If the surface is 
frictionless, the contact force N will be exerted along the direction of the 
surface normal $\nabla\phi \ne 0$, so we can write

$$\mathbf{N} = \lambda\nabla\phi, \tag{10.9}$$

where the proportionality factor $\lambda = \lambda(\mathbf{v}, \mathbf{x}, t)$ is a scalar function which can
be determined only by using the equation of motion. 

If x = x(t) is the particle path on the surface, then differentiation of (10.8) 
gives 


$$\dot{\phi} = \dot{\mathbf{x}}\cdot\nabla\phi + \partial_t\phi = 0,$$

or by virtue of (10.9),

$$\mathbf{v}\cdot\mathbf{N} = -\lambda\,\partial_t\phi. \tag{10.10}$$

The quantity $\mathbf{v}\cdot\mathbf{N}$ is the rate at which the constraining force $\mathbf{N}$ does work on
the particle, and, according to (10.10) it vanishes if and only if $\partial_t\phi = 0$.
Therefore, the constraining force is conservative if and only if the surface of
constraint remains at rest. For this reason, it is appropriate to say that an
equation of the form $\phi(\mathbf{x}) = 0$ determines a conservative constraint. The more
general time dependent equation (10.8) is said to determine a holonomic 
constraint. A conservative constraint is therefore a time-independent holo- 
nomic constraint. 

Before completing our general discussion of constrained motion, let us put
some flesh on these abstractions by considering some examples. 


EXAMPLE 1: Particle in a Constant Gravitational Field
As we shall verify below, the potential energy for a particle in a constant
gravitational (force) field can be written

$$V(\mathbf{x}) = -m\mathbf{g}\cdot\mathbf{x} = mgh, \tag{10.11}$$

where

$$h = -\hat{\mathbf{g}}\cdot\mathbf{x} \tag{10.12}$$

is the height of the particle above some arbitrarily chosen "ground level".
Note that Equation (10.12) can be interpreted in two ways: either as an
equation for the height $h$ as a function of position $\mathbf{x}$, or as an equation for a
1-parameter family of horizontal planes, which are the "equipotential
surfaces" of the gravitational field. The gravitational force is obtained by
differentiating (10.11) with the help of (2-8.37); thus,

$$-\nabla V = m\nabla(\mathbf{g}\cdot\mathbf{x}) = m\mathbf{g}.$$

The energy conservation law (10.7a) now takes the specific form

$$\tfrac{1}{2}m(\mathbf{v}^2 - \mathbf{v}_0^2) = m\mathbf{g}\cdot(\mathbf{x} - \mathbf{x}_0). \tag{10.13}$$


This is equivalent to (2.18), which we found before only after determining the 
general solution to the equation of motion. 
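
The energy relation (10.13) can be confirmed directly on the familiar parabolic trajectory. The sketch below is illustrative only: the helper names, the mass, and the initial conditions are arbitrary assumptions, and the trajectory used is the standard constant-acceleration solution with position equal to initial position plus initial velocity times time plus half the gravitational acceleration times time squared.

```python
def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def energy_balance(t, m=0.5, g=(0.0, 0.0, -9.8),
                   x0=(0.0, 0.0, 1.0), v0=(3.0, 0.0, 4.0)):
    """Return both sides of Eq. (10.13) at time t along the free-fall trajectory."""
    x = [x0[i] + v0[i] * t + 0.5 * g[i] * t * t for i in range(3)]
    v = [v0[i] + g[i] * t for i in range(3)]
    kinetic_change = 0.5 * m * (dot(v, v) - dot(v0, v0))
    work_by_gravity = m * dot(g, [x[i] - x0[i] for i in range(3)])
    return kinetic_change, work_by_gravity

lhs, rhs = energy_balance(0.7)
print(lhs, rhs)
```

The two printed numbers agree up to rounding at any time, without reference to the shape of the path.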

In Section 2-2 we saw that a particle in a constant gravitational field follows 
a parabolic trajectory, and if it is launched with a specific initial speed there 
are at most two such trajectories connecting a pair of given points. Let us
consider some alternative trajectories that result from adding holonomic 
constraints. 


EXAMPLE 2: Particle on a Stationary Plane Surface 
A block placed on a fixed frictionless plane will be subject to the equation of 
constraint 


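$$\phi(\mathbf{x}) = (\mathbf{x} - \mathbf{y})\cdot\mathbf{n} = 0, \tag{10.14}$$

which is the equation for a plane with unit normal $\nabla\phi = \mathbf{n}$ passing through a
given point $\mathbf{y}$. If the block is regarded as a particle, its equation of motion is,
in accordance with (10.9),

$$m\dot{\mathbf{v}} = m\mathbf{g} + \lambda\mathbf{n}. \tag{10.15}$$

Before we can solve this equation, we must determine the magnitude $\lambda$ of the
force exerted by the plane. Since the plane is at rest, the normal $\mathbf{n}$ is constant,
and, according to (10.10), $\mathbf{v}\cdot\mathbf{n} = 0$, so that $\dot{\mathbf{v}}\cdot\mathbf{n} = 0$ as well. So,
multiplying (10.15) by $\mathbf{n}$, we find that $\lambda$ is given by

$$0 = m\mathbf{g}\cdot\mathbf{n} + \lambda.$$

Using this to eliminate $\lambda$ from (10.15), we get

$$m\dot{\mathbf{v}} = m(\mathbf{g} - \mathbf{g}\cdot\mathbf{n}\,\mathbf{n}) = m\mathbf{g}_\parallel. \tag{10.16}$$

Thus, the net force is merely the component of the gravitational force in the
plane. The general solution of (10.16) is a parabola in the plane, provided, of
course, the initial velocity $\mathbf{v}_0$ satisfies the condition of constraint $\mathbf{v}_0\cdot\mathbf{n} = 0$.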
Since the constraint is conservative, the energy conservation law (10.13) still 
applies. 


EXAMPLE 3: Particle on a Moving Plane Surface 

The equation of constraint (10.14) can be generalized to describe a rigidly 
moving plane simply by allowing y = y(t) to be a given function describing 
the motion. It can be further generalized to describe a plane rotating about an 
axis through the point y by allowing the unit normal to be a function of time 
n = n(t). Let us consider implications of the first generalization. The equa- 
tion of constraint is 


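$$\phi(\mathbf{x}, t) = (\mathbf{x} - \mathbf{y}(t))\cdot\mathbf{n} = 0.$$

Differentiation implies

$$\dot{\phi} = (\mathbf{v} - \dot{\mathbf{y}})\cdot\mathbf{n} = 0$$

and

$$\ddot{\phi} = (\dot{\mathbf{v}} - \ddot{\mathbf{y}})\cdot\mathbf{n} = 0.$$

Using this to solve the equation of motion (10.15) for $\lambda$, we get

$$\lambda = m(\ddot{\mathbf{y}} - \mathbf{g})\cdot\mathbf{n}.$$

The equation of motion can therefore be written

$$\dot{\mathbf{v}} = \mathbf{g} + (\ddot{\mathbf{y}} - \mathbf{g})\cdot\mathbf{n}\,\mathbf{n}. \tag{10.17}$$

This is the generalization of (10.16) for a moving plane. Integration of (10.17)
is trivial since $\mathbf{y} = \mathbf{y}(t)$ is supposed to be a given function. Note that the orbit
is again a parabola if the acceleration $\ddot{\mathbf{y}}$ is constant. However, (10.17) will fail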
to apply if the plane moves in such a way as to “break contact” with the 
particle. 
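
The accelerations in (10.16) and (10.17) amount to a simple projection, which the following sketch evaluates. It is offered as an illustration under assumptions: the function names, the 30 degree incline, and the sample plane acceleration are hypothetical, and the normal is assumed to be a unit vector. Setting the plane acceleration to zero recovers the stationary case of Example 2.

```python
import math

def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def acceleration_on_plane(g, n, plane_accel=(0.0, 0.0, 0.0)):
    """Particle acceleration from Eq. (10.17) for a frictionless plane with unit normal n.

    plane_accel is the acceleration of the reference point y(t) on the plane.
    """
    coeff = dot([p - q for p, q in zip(plane_accel, g)], n)
    return [gi + coeff * ni for gi, ni in zip(g, n)]

theta = math.pi / 6                            # assumed 30 degree incline
n = (0.0, math.sin(theta), math.cos(theta))    # unit normal to the incline
g = (0.0, 0.0, -9.8)

a_rest = acceleration_on_plane(g, n)                      # stationary plane, Eq. (10.16)
plane_accel = (0.0, 0.0, 2.0)                             # plane accelerating upward
a_moving = acceleration_on_plane(g, n, plane_accel)       # moving plane, Eq. (10.17)
print(a_rest, a_moving)
```

For the stationary plane the result has no component along the normal and its magnitude is the familiar g times the sine of the incline angle; for the moving plane the normal component of the particle acceleration matches that of the plane, as the constraint requires.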


EXAMPLE 4: Particle on a Stationary Spherical Surface 

Now consider a particle constrained to move on the surface of a sphere of 
radius a. If the sphere’s center is chosen as the origin, the equation of 
constraint can be written in the form 


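$$\mathbf{x}^2 - a^2 = 0,$$

or, in the form

$$\phi(\mathbf{x}) = |\mathbf{x}| - a = 0. \tag{10.18a}$$

The second form is a little more convenient because by (2-8.39) its gradient is
equal to the unit exterior normal to the sphere

$$\hat{\mathbf{x}} = \nabla|\mathbf{x}| = \nabla\phi. \tag{10.18b}$$

The equation of motion is

$$m\dot{\mathbf{v}} = m\mathbf{g} + \lambda\hat{\mathbf{x}}. \tag{10.19}$$

Whence,

$$\lambda = m(\dot{\mathbf{v}} - \mathbf{g})\cdot\hat{\mathbf{x}}.$$

According to (10.10), Equation (10.18b) implies that $\mathbf{v}\cdot\mathbf{x} = 0$. Differentiating
$\mathbf{v}\cdot\mathbf{x} = 0$, we get $\dot{\mathbf{v}}\cdot\mathbf{x} + \mathbf{v}^2 = 0$ or

$$\dot{\mathbf{v}}\cdot\hat{\mathbf{x}} = -\frac{\mathbf{v}^2}{a}. \tag{10.20}$$

This enables us to express $\lambda$ in the form

$$\lambda = -m\left(\mathbf{g}\cdot\hat{\mathbf{x}} + \frac{\mathbf{v}^2}{a}\right). \tag{10.21}$$

This can be reduced to a function of position alone by using energy
conservation. The result is obtained in terms of initial conditions by using
(10.13) to eliminate $\mathbf{v}^2$ from (10.21); thus,

$$\lambda = -\frac{m}{a}\left(3\mathbf{g}\cdot\mathbf{x} - 2\mathbf{g}\cdot\mathbf{x}_0 + \mathbf{v}_0^2\right). \tag{10.22}$$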


Either (10.21) or (10.22) can be substituted into (10.19) to get a well-defined
equation of motion; however, the result looks complicated, and we shall find 
more convenient ways to express it later on. In the meantime, we can draw 
some significant conclusions directly from (10.22). 

Equation (10.22) allows both positive and negative values for $\lambda$. However,
if the particle is a small object on the surface of a solid ball, only positive values
of $\lambda$ are significant, because the force exerted by the ball must be outward. At
a point where $\lambda = 0$ the constraining force vanishes, so the particle is no
longer in contact with the surface. Therefore, from (10.22) we can conclude
that the particle will break contact with the surface at a height $h$ above the
"equatorial plane" of the ball given by

$$h = -\hat{\mathbf{g}}\cdot\mathbf{x} = \frac{1}{3}\left(\frac{\mathbf{v}_0^2}{g} - 2\hat{\mathbf{g}}\cdot\mathbf{x}_0\right).$$

In particular, if the particle starts from rest at the top of the ball, then
$h = \tfrac{2}{3}(-\hat{\mathbf{g}}\cdot\mathbf{x}_0) = \tfrac{2}{3}a$.
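
The break-away condition can be tabulated from (10.22) once positions are expressed through heights above the equatorial plane. The sketch below is a hedged illustration: the function names and sample numbers are assumptions, heights are measured along the upward vertical so that the inner product of the gravitational acceleration with a position equals minus g times its height, and the multiplier is the coefficient of the outward unit normal in (10.19).

```python
def constraint_multiplier(h, h0, v0_sq, g, a, m):
    """Multiplier lambda of Eq. (10.22), written in terms of heights h and h0.

    h0 and v0_sq are the initial height and squared initial speed;
    a is the radius of the ball and m the particle mass.
    """
    return -(m / a) * (-3.0 * g * h + 2.0 * g * h0 + v0_sq)

def break_height(h0, v0_sq, g):
    """Height at which lambda vanishes and the particle leaves the ball."""
    return (2.0 * h0 + v0_sq / g) / 3.0

a, m, g = 1.0, 2.0, 9.8            # assumed radius, mass, gravitational acceleration
h_leave = break_height(a, 0.0, g)  # particle released from rest at the top
print(h_leave)
print(constraint_multiplier(a, a, 0.0, g, a, m))        # at the top: outward force of size m * g
print(constraint_multiplier(h_leave, a, 0.0, g, a, m))  # at break-away: approximately zero
```

Starting from rest at the top, the multiplier falls from m times g to zero as the particle descends to two thirds of the radius above the equatorial plane.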

If the particle is constrained by the inside instead of the outside of the 
spherical surface, then the constraining force must point inward so $\lambda < 0$.
A common example of this kind of constraint is a pendulum consisting of a
bob supported by a massless flexible string. On the other hand, for a pendulum
consisting of a bob supported by a massless rigid rod, the constraining
force may be either inward or outward, so $\lambda$ can be either positive or negative.
Constraints of this kind, which do not allow the particle to leave the surface of 
constraint, are called bilateral. Constraints which confine a particle to one 
side of a surface are called unilateral. The same general equations can be 
applied to both kinds of constraint, but, as we have seen, they must be 
interpreted differently in each case. 

Our analysis of spherical constraints, in particular the derivation and 
application of (10.29), is readily generalized to handle constraints exerted by 


any smooth surface. However, a complete generalization involves the differ- 
ential geometry of surfaces, which is beyond the scope of this text. 


Lagrange’s Equations for Constrained Motion 


We are still faced with the problem of developing a systematic method for 
solving the equations of motion subject to holonomic constraints. This prob- 
lem was solved by Lagrange. His method employs constraints expressed in 
parametric form rather than the nonparametric form (10.8). 

The parametric equation for a surface has the form $\mathbf x = \mathbf x(q_1, q_2, t)$, where
$q_1$ and $q_2$ are independent scalar parameters (or coordinates), and the explicit
t dependence allows for the possibility that the surface is moving. For fixed $q_2$
and variable $q_1$, the parametric equation describes a “coordinate curve” on
the surface with a tangent vector $\mathbf e_1$ at each point defined by

$$\mathbf e_1 = \partial_{q_1}\mathbf x, \tag{10.23}$$

where $\partial_{q_1}$ denotes the derivative with respect to $q_1$. Similarly, the variable $q_2$
determines a tangent vector $\mathbf e_2$ at each point on the surface, as shown in Figure
10.1. Both cases are covered by the “free index notation”

$$\mathbf e_\alpha = \partial_{q_\alpha}\mathbf x, \tag{10.23}$$

where $\alpha = 1, 2$.

Fig. 10.1. Coordinate curves and tangent vectors on a surface of constraint.

The equation of motion has the form

$$m\ddot{\mathbf x} = \mathbf F + \mathbf N, \tag{10.24}$$

where $\mathbf N$ is the force of constraint and $\mathbf F$ is some given external force. Since
$\mathbf N$ is normal to the surface of constraint, we have

$$\mathbf N\cdot\mathbf e_\alpha = 0. \tag{10.25}$$

Therefore, multiplication of (10.24) by the $\mathbf e_\alpha$ gives us independent
components of the equation of motion on the surface:

$$m\ddot{\mathbf x}\cdot\mathbf e_\alpha = \mathbf F\cdot\mathbf e_\alpha. \tag{10.26}$$


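These equations can be expressed as a set of differential equations for the
coordinates $q_\alpha$ alone. This is merely an exercise using the chain rule of
differentiation.

The trajectory of the particle on the surface of constraint is described by the
parametric equation

$$\mathbf x = \mathbf x(q_1(t), q_2(t), t).$$

Therefore, its velocity is given by the parametric equation

$$\dot{\mathbf x} = \sum_\alpha \dot q_\alpha\,\partial_{q_\alpha}\mathbf x + \partial_t\mathbf x, \tag{10.27}$$

where the sum is over all values of $\alpha$. Hence

$$\partial_{\dot q_\alpha}\dot{\mathbf x} = \partial_{q_\alpha}\mathbf x = \mathbf e_\alpha$$

and

$$\dot{\mathbf e}_\alpha = \Big(\sum_\beta \dot q_\beta\,\partial_{q_\beta} + \partial_t\Big)\partial_{q_\alpha}\mathbf x = \partial_{q_\alpha}\dot{\mathbf x},$$

because $\partial_{q_\alpha}\dot q_\beta = 0$ for all values of $\alpha$ and $\beta$. We can use these facts to rewrite
the left side of (10.26); thus,

$$m\ddot{\mathbf x}\cdot\mathbf e_\alpha = m\left(\frac{d}{dt}(\dot{\mathbf x}\cdot\mathbf e_\alpha) - \dot{\mathbf x}\cdot\dot{\mathbf e}_\alpha\right).$$

But,

$$\dot{\mathbf x}\cdot\mathbf e_\alpha = \dot{\mathbf x}\cdot(\partial_{\dot q_\alpha}\dot{\mathbf x}) = \tfrac12\,\partial_{\dot q_\alpha}(\dot{\mathbf x}^2)$$

and

$$\dot{\mathbf x}\cdot\dot{\mathbf e}_\alpha = \dot{\mathbf x}\cdot(\partial_{q_\alpha}\dot{\mathbf x}) = \tfrac12\,\partial_{q_\alpha}(\dot{\mathbf x}^2).$$

Hence

$$m\ddot{\mathbf x}\cdot\mathbf e_\alpha = \frac{d}{dt}\left(\partial_{\dot q_\alpha}K\right) - \partial_{q_\alpha}K,$$

where $K = \tfrac12 m\dot{\mathbf x}^2$ is the kinetic energy of the particle.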
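Now introducing the notation

$$Q_\alpha \equiv \mathbf F\cdot\mathbf e_\alpha = \mathbf F\cdot(\partial_{q_\alpha}\mathbf x) \tag{10.28}$$

for the component of force in the $\mathbf e_\alpha$ direction, we can write (10.26) in the
form

$$\frac{d}{dt}\left(\partial_{\dot q_\alpha}K\right) - \partial_{q_\alpha}K = Q_\alpha. \tag{10.29}$$

This is called Lagrange’s equation. There is one such equation for each of the
coordinates $q_\alpha$. To use Lagrange’s equation, it is necessary to express K and
$Q_\alpha$ as functions of $q_\alpha$ and $\dot q_\alpha$ by using the parametric equations of constraint
$\mathbf x = \mathbf x(q, t)$, where, for brevity, the symbol q is used to denote the whole set of
coordinates $q_\alpha$. Thus, using (10.27), we find

$$K = \tfrac12 m\dot{\mathbf x}^2 = \tfrac12\sum_\alpha\sum_\beta a_{\alpha\beta}\,\dot q_\alpha\dot q_\beta + \sum_\alpha b_\alpha\dot q_\alpha + c, \tag{10.30}$$

where the coefficients are functions of the coordinates given by

$$a_{\alpha\beta} = a_{\alpha\beta}(q, t) = m\,\mathbf e_\alpha\cdot\mathbf e_\beta,$$
$$b_\alpha = b_\alpha(q, t) = m\,\mathbf e_\alpha\cdot\partial_t\mathbf x, \tag{10.31}$$
$$c = c(q, t) = \tfrac12 m(\partial_t\mathbf x)^2.$$

If the external force is conservative so that $\mathbf F = -\nabla V(\mathbf x)$, then the equation
of constraint gives us $V = V(\mathbf x(q, t)) = V(q, t)$, and

$$Q_\alpha = \mathbf F\cdot(\partial_{q_\alpha}\mathbf x) = -(\partial_{q_\alpha}\mathbf x)\cdot\nabla V = -\partial_{q_\alpha}V.$$

In this case, Lagrange’s equation (10.29) can be written in the form

$$\frac{d}{dt}\left(\partial_{\dot q_\alpha}L\right) - \partial_{q_\alpha}L = 0, \tag{10.32}$$

where the so-called Lagrangian $L = L(q, \dot q, t)$ is defined by

$$L = K - V. \tag{10.33}$$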


Once Lagrange’s equations have been solved for the functions $q_\alpha = q_\alpha(t)$,
the result can be substituted in the equation of constraint to get an explicit
equation for the orbit $\mathbf x(t) = \mathbf x(q(t), t)$. Lagrange’s method has the advantage
of totally eliminating the need to consider forces of constraint, which are
often of no interest in themselves. However, once the orbit has been found by
Lagrange’s method, the force of constraint can be computed directly from
$\mathbf N = m\ddot{\mathbf x} - \mathbf F$. Just the same, our previous method using a nonparametric
constraint is often a more efficient way to find $\mathbf N$.

It will be noted that our derivation of Lagrange’s equation is actually 
independent of the number of coordinates in the equation of constraint. It is 
only necessary to sum over the appropriate number of coordinates in (10.27) 
and (10.30). If the particle is constrained to a curve rather than a surface, then 
the parametric equation involves only one coordinate and only one Lagrange 
equation is needed to determine the motion. On the other hand, if there are 
no constraints, then three coordinates are needed and Lagrange’s method is 
simply a way of writing the equivalent of Newton’s equation in terms of 
coordinates. 


EXAMPLE 5: The Plane Pendulum 

Now let us consider an application of Lagrange’s method. The best way of
writing a parametric equation for a sphere will not be evident until we have
discussed rotations in Chapter 5. So let us limit our considerations here to the
special case of a particle constrained to a vertical circle. This is the so-called
simple pendulum. The equation of constraint can be written in the parametric
form

$$\mathbf x = \mathbf x(\phi) = \mathbf a\,e^{\mathbf i\phi},$$

where, as shown in Figure 10.2, the constant $\mathbf a$ is the vertical radius vector,
and $\mathbf i$ is the unit bivector for the plane of the circle. The external force is
the conservative gravitational force, so we can get Lagrange’s equation from
a Lagrangian. Our first task is to express the Lagrangian as an explicit
function of $\phi$ and $\dot\phi$. The equation of constraint gives

$$\dot{\mathbf x} = \mathbf x\,\mathbf i\,\dot\phi = -\mathbf i\,\mathbf x\,\dot\phi.$$

Hence,

$$K = \tfrac12 m(\mathbf x\,\mathbf i\,\dot\phi)^2 = \tfrac12 m a^2\dot\phi^2,$$

where $a = |\mathbf x| = |\mathbf a|$.

Fig. 10.2. Vector diagram for the simple pendulum.

Also,

$$V = -m\,\mathbf g\cdot\mathbf x = -m\langle\mathbf g\,\mathbf a\,e^{\mathbf i\phi}\rangle_0 = -mga\,\langle e^{\mathbf i\phi}\rangle_0 = -mga\cos\phi.$$

Hence

$$L = \tfrac12 m a^2\dot\phi^2 + mga\cos\phi.$$

Now

$$\partial_{\dot\phi}L = ma^2\dot\phi, \qquad \partial_\phi L = -mga\sin\phi.$$

So Lagrange’s equation

$$\frac{d}{dt}\left(\partial_{\dot\phi}L\right) - \partial_\phi L = 0$$

takes the explicit form

$$ma^2\ddot\phi + mga\sin\phi = 0.$$

The general solution of this equation involves elliptic functions. However, for
small oscillations the approximation $\sin\phi \approx \phi$ is often satisfactory, in which
case Lagrange’s equation reduces to

$$\ddot\phi + \frac{g}{a}\phi = 0.$$

This is the familiar equation for harmonic motion, so we can write down the
solution at once. If $\phi = 0$ at time $t = 0$, the solution is

$$\phi = \phi_0\sin\omega_0 t,$$

where $\omega_0 = (g/a)^{1/2}$. Therefore, the parametric equation for the orbit is

$$\mathbf x(t) = \mathbf a\,e^{\mathbf i\phi_0\sin\omega_0 t}.$$

We shall study the pendulum from other points of view later on.
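
To see how good the small-angle approximation is, one can integrate the full pendulum equation numerically and compare with the harmonic solution. The sketch below is illustrative only: it assumes a fourth-order Runge-Kutta integrator, a pendulum of length 1 m with g = 9.81 m/s², and an amplitude of 0.05 rad; none of these choices come from the text.

```python
import math


def simulate_pendulum(phi_dot0, g_over_a, t_end, steps):
    """Integrate phi'' = -(g/a) sin(phi) with RK4, starting from phi(0) = 0."""
    dt = t_end / steps
    phi, omega = 0.0, phi_dot0

    def deriv(angle, rate):
        # Returns (d phi/dt, d omega/dt) for the full nonlinear pendulum.
        return rate, -g_over_a * math.sin(angle)

    for _ in range(steps):
        k1p, k1w = deriv(phi, omega)
        k2p, k2w = deriv(phi + 0.5 * dt * k1p, omega + 0.5 * dt * k1w)
        k3p, k3w = deriv(phi + 0.5 * dt * k2p, omega + 0.5 * dt * k2w)
        k4p, k4w = deriv(phi + dt * k3p, omega + dt * k3w)
        phi += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return phi


g_over_a = 9.81 / 1.0            # assumed values of g and a
omega0 = math.sqrt(g_over_a)     # small-oscillation angular frequency
phi_amp = 0.05                   # assumed amplitude in radians
t = 1.0

phi_numeric = simulate_pendulum(phi_amp * omega0, g_over_a, t, 2000)
phi_linear = phi_amp * math.sin(omega0 * t)
print(phi_numeric, phi_linear)
```

For larger amplitudes the two values drift apart, reflecting the amplitude dependence of the period that the elliptic-function solution captures.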


Frictional Forces 


Let us conclude this section with a suggestion on how to account for friction.
An object sliding on the surface of a solid body is subject to a frictional force
$\mathbf F$ described by the empirical formula

$$\mathbf F = -\mu N\hat{\mathbf v}, \tag{10.34}$$

where $\mathbf v$ is the velocity relative to the surface, $\mu$ is a constant called “the
coefficient of sliding friction”, and $N = |\mathbf N|$ is the force of constraint normal
to the surface. The method of Lagrange by itself cannot handle such a
frictional force, because N is not a known function. However, we can
evaluate N from Newton’s Law, just as we have done before. Indeed, for a
particle on a spherical surface, we can use (10.21) to get the frictional force in
the form


F=% =v + gx). (10.35) 


This can be used in the Lagrange equations to determine the orbit on the 
sphere. However, approximation methods must be used to solve the resulting 
equations for even the simplest problems. 


3-10. Exercises 


(10.1) The bob on a pendulum with flexible support of length a moves with
speed $v_0$ at the bottom of a vertical circle. What is the minimum
value of $v_0$ needed for the bob to reach the top of the circle? Why
can’t this result be obtained by energy conservation assuming speed
v = 0 at the top?

(10.2) For a particle subject to a force $\mathbf f$:

(a) Show that if $\mathbf a$ is a constant vector, then $\mathbf f\cdot\mathbf a = 0$ implies that
$\mathbf v\cdot\mathbf a$ is a constant of motion. What does this imply about the
trajectory?

(b) Find a constant of motion when $\mathbf a\wedge\mathbf f = 0$.

(c) Show that if $\mathbf v_0\wedge\mathbf f = 0$, then the particle moves in a straight line.
Though $\hat{\mathbf v}$ is a constant of motion here, this would not be called
a conservation law, because it holds only for particular values of
the initial velocity $\mathbf v_0$.

(10.3) A bead is constrained to move on a frictionless right circular helical
wire described by the parametric equation

$$\mathbf x(\theta) = \mathbf a\,e^{\mathbf i\theta} + \mathbf b\,\theta.$$

The wire is placed upright in a constant gravitational field (Figure
10.3). Evaluate and solve Lagrange’s equation for $\theta$. Determine
how the height of the bead varies with time, and compare it with the
motion of a particle on an inclined plane.

(10.4) A bead moves on a frictionless hoop rotating in a horizontal plane
with a constant angular speed $\omega$ about a fixed point on the hoop.
Show that the bead oscillates about a diameter of the hoop like a
simple pendulum of length $g/\omega^2$. Begin by establishing the equation
of constraint

$$\mathbf x(\phi, t) = \mathbf a\,e^{\mathbf i\omega t}(1 + e^{\mathbf i\phi}),$$

as suggested by Figure 10.4.

Fig. 10.3. Bead on a helical wire.
Fig. 10.4. Bead on a rotating hoop.

(10.5) The equation of constraint for a particle in a plane can be written

x Sexe) — rere.

The variables r and $\theta$ are polar coordinates. Express the kinetic
energy in terms of these variables, and determine the form of the
Lagrange equations if the particle is subject to an arbitrary external
force $\mathbf F$. Derive the same equations directly from Newton’s law.

(10.6) For a conservative force, expand the potential $V(\mathbf x)$ in a Taylor
series about an equilibrium point $\mathbf x_0$, and show that, in the first
approximation, it is equivalent to the anisotropic oscillator potential

$$V(\mathbf r) = -\tfrac12\,\mathbf r\cdot\mathcal L(\mathbf r),$$

where $\mathbf r = \mathbf x - \mathbf x_0$, and

$$\mathcal L(\mathbf r) = -\nabla V(\mathbf r)$$

is the linear binding force characterized by Equation (8.20). The
general shape of an equipotential $V(\mathbf r)$ = constant is an ellipsoid.
Use Equation (8.20) to show the energy of the oscillator has the
constant value

$$\tfrac12\left(k_1a_1^2 + k_2a_2^2 + k_3a_3^2\right).$$


Chapter 4 


Central Forces and Two-Particle Systems 


The simple laws of force studied in Chapter 3 are said to be phenomenological 
laws, which is to say that they are only ad hoc or approximate descriptions of 
real forces in nature. As a rule, they describe resultants of forces exerted by a 
very large number of particles. A fundamental force law describes the force 
exerted by a single particle. The simplest candidates for such a law are central 
forces with the particle at the center of force. This is reason enough for the 
systematic study of central forces in this chapter. And it should be no surprise 
that the results are of great practical value. 

The investigation of fundamental forces is actually a two-particle problem, 
for, as Newton’s third law avers, a particle cannot act without being acted 
upon. Fortunately, the two-particle central force problem can be reduced to a 
mathematically equivalent one-particle problem, greatly simplifying the so- 
lution. However, a complete description of central force motion must include 
an account of the ‘‘two-body effects” involved in this reduction. 


4-1. Angular Momentum 


We have seen that motion of a particle in any conservative force field is 
characterized by a conserved quantity called energy. Now we shall show that 
general motion in a central force field is characterized by another conserved 
quantity called angular momentum. Then we shall derive the basic properties 
of angular momentum which will be helpful in a detailed analysis of central 
force motion. 

A force field $\mathbf f = \mathbf f(\mathbf x)$ is said to be central if it is everywhere directed along a
line through a fixed point $\mathbf x'$ called the center of force. This property can be
expressed by the equation

$$(\mathbf x - \mathbf x')\wedge\mathbf f = \mathbf r\wedge\mathbf f = 0, \tag{1.1}$$

where $\mathbf r = \mathbf x - \mathbf x'$ is introduced as the convenient position variable with the
center of force as origin. The angular momentum $\mathbf L$ about the center of force
for a particle with mass m is defined by

$$\mathbf L = m\,\mathbf r\wedge\dot{\mathbf r} = m(\mathbf x - \mathbf x')\wedge\dot{\mathbf x}. \tag{1.2}$$

It is customary to define the angular momentum as a vector quantity

$$\mathbf l = m\,\mathbf r\times\dot{\mathbf r}. \tag{1.3}$$

However, the bivector $\mathbf L$ is more fundamental than the vector $\mathbf l$ and will be
somewhat more convenient in our study of central forces. In any case, it is
easy to switch from one quantity to the other, because they are related by
duality; specifically,

$$\mathbf L = i\,\mathbf l. \tag{1.4}$$

We shall use the term angular momentum for either $\mathbf L$ or $\mathbf l$ and add the term
“bivector” or “vector” if it is necessary to specify one or the other.
Now, from the equation of motion we have

$$\mathbf r\wedge\mathbf f = \mathbf r\wedge(m\ddot{\mathbf r}) = \frac{d}{dt}\left(m\,\mathbf r\wedge\dot{\mathbf r}\right),$$

because $\dot{\mathbf r}\wedge\dot{\mathbf r} = 0$. Hence,

$$\mathbf r\wedge\mathbf f = 0 \quad\text{if and only if}\quad \dot{\mathbf L} = 0, \tag{1.5}$$


that is, the angular momentum is conserved if and only if the force is central. It 
should be noted that this conclusion holds even if the force is velocity 
dependent, though central forces of this type are not common enough to 
merit special attention here. 

Angular momentum has a simple geometrical interpretation. According to
Equation (2-8.26), the directed area $\mathbf A = \mathbf A(t)$ swept out by the radius vector
$\mathbf r$ in time t is given by

$$\mathbf A(t) = \frac12\int_{\mathbf r(0)}^{\mathbf r(t)}\mathbf r\wedge d\mathbf r = \frac12\int_0^t\mathbf r\wedge\dot{\mathbf r}\,dt. \tag{1.6}$$

Therefore, the rate at which area is swept out is determined by the angular
momentum according to

$$\dot{\mathbf A} = \frac12\,\mathbf r\wedge\dot{\mathbf r} = \frac{\mathbf L}{2m}. \tag{1.7}$$

For constant angular momentum this can be integrated immediately, with the
result

$$\mathbf A(t) = \frac{1}{2m}\mathbf L\,t. \tag{1.8}$$

Thus, we conclude that the radius vector of a particle in a central force field
sweeps out area at a constant rate. This is a generalization of Kepler’s Second
Law of planetary motion.
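
The constancy of the areal velocity can be illustrated numerically for a central force that is not an inverse square law. The sketch below is an illustration under stated assumptions: positions and velocities in the orbital plane are represented by Python complex numbers, the force law $-\mathbf r\,|\mathbf r|$ per unit mass is an arbitrary choice, and the velocity Verlet integrator is used because it preserves $\mathbf r\wedge\mathbf v$ for central forces up to rounding error.

```python
def areal_velocity(r, v):
    """Return (1/2) r ^ v for planar motion; r and v are complex numbers."""
    return 0.5 * (r.conjugate() * v).imag


def central_acceleration(r):
    """An arbitrary attractive central force per unit mass (assumed form)."""
    return -r * abs(r)


def integrate(r, v, dt, steps):
    """Velocity Verlet integration; returns the areal velocity at each step."""
    rates = []
    a = central_acceleration(r)
    for _ in range(steps):
        v_half = v + 0.5 * dt * a
        r = r + dt * v_half
        a = central_acceleration(r)
        v = v_half + 0.5 * dt * a
        rates.append(areal_velocity(r, v))
    return rates


rates = integrate(r=1.0 + 0.0j, v=0.3 + 0.8j, dt=1e-3, steps=5000)
spread = max(rates) - min(rates)
print(rates[0], spread)
```

Replacing `central_acceleration` with a force that has a component perpendicular to `r` makes the spread grow, in line with (1.5).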

The orbit of a particle in a central field lies in a plane through the center of
force with direction given by the angular momentum; for, from (1.2) we
deduce that every point $\mathbf r$ on the orbit satisfies

$$\mathbf r\wedge\mathbf L = 0, \tag{1.9}$$

which, as we have seen in Section 2-6, is a necessary and sufficient condition
for a point $\mathbf r$ to lie in the $\mathbf L$-plane.

If the orbit in a central field is closed, then the particle will return to its
starting point in a definite time T called the period of the motion. From (1.6)
and (1.8), we conclude that the period is given by

$$\mathbf A(T) = \frac12\oint\mathbf r\wedge d\mathbf r = \frac{1}{2m}\mathbf L\,T. \tag{1.10}$$

As we shall see, this formula leads to Kepler’s third law of planetary motion.

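A major reason for the importance of angular momentum is the fact that it
determines the rate at which the radius vector changes direction. To show
this, we differentiate $\mathbf r = r\hat{\mathbf r}$ to get

$$\dot{\mathbf r} = \dot r\,\hat{\mathbf r} + r\,\dot{\hat{\mathbf r}}. \tag{1.11}$$

Whence,

$$\mathbf r\wedge\dot{\mathbf r} = r^2\,\hat{\mathbf r}\wedge\dot{\hat{\mathbf r}} = r^2\,\hat{\mathbf r}\,\dot{\hat{\mathbf r}}, \tag{1.12}$$

or

$$\dot{\hat{\mathbf r}} = \frac{\hat{\mathbf r}\,\mathbf L}{mr^2} = -\frac{\mathbf L\,\hat{\mathbf r}}{mr^2}. \tag{1.13}$$

When $\mathbf L$ is constant, this gives $\dot{\hat{\mathbf r}}$ as an explicit function of $\mathbf r$. Substituting (1.13)
into (1.11), we get the velocity in the form

$$\mathbf v = \hat{\mathbf r}\left(\dot r + \frac{\mathbf L}{mr}\right) = \left(\dot r - \frac{\mathbf L}{mr}\right)\hat{\mathbf r}. \tag{1.14}$$

For any central force motion, this can be used to determine the velocity as a
function of direction $\dot{\mathbf r} = \mathbf v(\hat{\mathbf r})$ whenever the orbit is expressed as an equation
of the form $r = r(\hat{\mathbf r})$ specifying the radial distance as a function of direction.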

For planar motion, we can express the radial direction $\hat{\mathbf r}$ in the parametric
form

$$\hat{\mathbf r} = \hat{\boldsymbol\varepsilon}\,e^{\mathbf i\theta}, \tag{1.15}$$

where $\hat{\boldsymbol\varepsilon}$ is a fixed unit vector, $\mathbf i = \hat{\mathbf L}$ is the unit bivector for the orbital plane,
and $\theta = \theta(t)$ is the scalar measure for the angle of rotation. Differentiating
(1.15) and equating to (1.13), we get

$$\dot{\hat{\mathbf r}} = \hat{\mathbf r}\,\mathbf i\,\dot\theta = \frac{\hat{\mathbf r}\,\mathbf L}{mr^2}.$$

So,

$$\mathbf i\,\dot\theta = \frac{\mathbf L}{mr^2}, \tag{1.16}$$

Whence

$$\dot\theta = \frac{|\mathbf L|}{mr^2}. \tag{1.17}$$

The derivation of this result assumes only that $\hat{\mathbf L}$ is constant. If $|\mathbf L|$ is a
constant also, then (1.17) implies that $\dot\theta \ge 0$ always, so $\theta = \theta(t)$ increases
monotonically with time if $|\mathbf L| \ne 0$. Therefore, the orbit of a particle in a
central field never changes the direction of its circulation about the center of
force. We could also have reached this conclusion directly from (1.13).


4-1. Exercises 


(1.1) Prove that $\dot{\mathbf L} = 0$ implies that the magnitude $|\mathbf L|$ and the direction
$\hat{\mathbf L}$ of the angular momentum $\mathbf L$ are separately constants of motion.


4-2. Dynamics from Kinematics 


The science of motion can be subdivided into kinematics and dynamics. 
Kinematics is concerned with the description of motion without considering 
conditions or interactions required to bring particular motions about. Dy- 
namics is concerned with the explanation of motion by specifying forces or 
other laws of interaction to describe the influence of one physical system on 
another. 

If the dynamics is known, the kinematics of particle motion can be deter- 
mined by solving the equation of motion. However, the converse problem of 
determining dynamics from kinematics is far more difficult, and it is rarely 
solved without considerable prior knowledge about force laws likely to be 
operative. Historically, one of the first and still the most significant solutions to
such a problem was Newton's deduction of the law of gravitation from Kepler’s
laws. Let us see how the problem can be formulated and solved in modern 
language. 

Kepler’s Laws of Planetary Motion can be formulated as follows: 

(1) The planets move in ellipses with the sun at one focus. 

(2) The radius vector sweeps out equal areas in equal times. 

(3) The square of the period of revolution is proportional to the cube of the 
semi-major axis. 

The first and second laws are illustrated in Figure 2.1; the elliptical orbit is 
divided into six segments of equal area, showing how a planet’s speed 
decreases with increasing distance from the sun. 




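After the discussion in the last section, we recognize Kepler’s second law
immediately as a statement of angular momentum conservation, and we
conclude that the planets move in a central field with the sun at the center of
force. For constant angular momentum, we can compute the acceleration of a
particle by using (1.13) to differentiate the expression (1.14) for its velocity;
thus,

$$\ddot{\mathbf r} = \dot{\hat{\mathbf r}}\left(\dot r + \frac{\mathbf L}{mr}\right) + \hat{\mathbf r}\left(\ddot r - \frac{\mathbf L\,\dot r}{mr^2}\right) = \hat{\mathbf r}\left[\frac{\mathbf L}{mr^2}\left(\dot r + \frac{\mathbf L}{mr}\right) + \ddot r - \frac{\mathbf L\,\dot r}{mr^2}\right],$$

from which we obtain

$$m\ddot{\mathbf r} = \left(m\ddot r + \frac{\mathbf L^2}{mr^3}\right)\hat{\mathbf r}. \tag{2.1}$$

Fig. 2.1. Kepler’s first and second laws.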


The right side of this equation has the form of a central force as required, and 
we can determine the magnitude of the force by evaluating the coefficient. 

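Recalling the equation for an ellipse discussed in Section 2.6, we can
express Kepler’s first law as an equation of the form

$$r = \frac{\ell}{1 + \boldsymbol\varepsilon\cdot\hat{\mathbf r}}, \tag{2.2}$$

where $\ell$ is a positive constant and $\boldsymbol\varepsilon$ is a fixed vector in the orbital plane. As an
aid to differentiating (2.2), we multiply (1.13) by $\boldsymbol\varepsilon$ and note that, since
$\boldsymbol\varepsilon\wedge\mathbf L = 0$, the scalar part of the result can be written

$$\boldsymbol\varepsilon\cdot\dot{\hat{\mathbf r}} = \frac{(\boldsymbol\varepsilon\wedge\hat{\mathbf r})\,\mathbf L}{mr^2}, \tag{2.3a}$$

while the bivector part has the form

$$\boldsymbol\varepsilon\wedge\dot{\hat{\mathbf r}} = \frac{(\boldsymbol\varepsilon\cdot\hat{\mathbf r})\,\mathbf L}{mr^2}. \tag{2.3b}$$

Now, differentiating (2.2) with the help of (2.3a), we get

$$\dot r = -\frac{(\boldsymbol\varepsilon\wedge\hat{\mathbf r})\,\mathbf L}{m\ell}.$$

Differentiating again, we obtain

$$\ddot r = -\frac{(\boldsymbol\varepsilon\cdot\hat{\mathbf r})\,\mathbf L^2}{m^2\ell r^2} = -\left(\frac1r - \frac1\ell\right)\frac{\mathbf L^2}{m^2r^2},$$

which, since $\mathbf L^2 = -L^2$, gives

$$m\ddot r - \frac{L^2}{mr^3} = -\frac{k}{r^2}, \tag{2.4}$$

where

$$k = \frac{L^2}{m\ell} > 0. \tag{2.5}$$

Comparison of (2.4) with (2.1) leads to the attractive central force law

$$\mathbf f = -\frac{k}{r^2}\hat{\mathbf r}. \tag{2.6}$$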
2 


Thus, we have arrived at Newton’s inverse square law for gravitational force. 
The problem remains to evaluate the constant k in terms of measurable 
quantities. Kepler’s third law can be used for this purpose. 

According to (1.10) the period T of a planet’s motion depends on the area 
enclosed by its orbit. The area integral in (1.10) is most easily evaluated by 
taking advantage of the fact, proved in Section 2-8, that it is independent of 
the choice of origin. As we have seen before, with the origin at the center 
instead of at a focus, an ellipse can be described by the parametric equation 


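$$\mathbf x = \mathbf a\cos\phi + \mathbf b\sin\phi, \tag{2.7}$$

where $a = |\mathbf a|$ is the semi-major axis referred to in Kepler’s Third Law.
Using this to carry out the integration, we get

$$\mathbf A = \frac12\oint\mathbf r\wedge d\mathbf r = \frac12\oint\mathbf x\wedge d\mathbf x = \frac12\int_0^{2\pi}\mathbf x\wedge(\partial_\phi\mathbf x)\,d\phi = \pi\,\mathbf a\wedge\mathbf b. \tag{2.8}$$

According to (1.10), therefore, the period is given by

$$T = \frac{2m|\mathbf A|}{L} = \frac{2\pi mab}{L}. \tag{2.9}$$

Squaring this, using (2.5) and the fact that $b^2 = a\ell$ (Exercise (2.2)), we obtain

$$\frac{T^2}{a^3} = \frac{4\pi^2m^2\ell}{L^2} = \frac{4\pi^2m}{k}. \tag{2.10}$$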


Kepler's third law says that this ratio has the same numerical value for all
planets. Therefore, the constant k must be proportional to the mass m. Also,
note the surprising fact that (2.10) implies that the period T does not depend
on the eccentricity.

Universality of Newton’s Law


We have learned all that Kepler’s laws can tell us about dynamics. Having 
recognized as much, Newton set about investigating the possibility that the 
inverse square law (2.6) is a universal law of attraction between all massive 
particles. He hypothesized that each planet exerts a force on the Sun equal 
and opposite to the force exerted by the Sun on the planet. From Kepler’s 
third law, then, he could conclude that the constant k is proportional to the 
mass M of the Sun as well as the mass m of the planet, that is, 


k = GmM, (2.11) 


where G is a universal constant describing the strength of gravitational
attraction between all bodies. Substitution of (2.11) into (2.10) gives

$$\frac{T^2}{a^3} = \frac{4\pi^2}{GM}. \tag{2.12}$$

The constant G can be determined by measuring the gravitational force
between objects on Earth, so (2.12) gives the mass of the Sun from
astronomical measurements of a and T.
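
As a worked illustration of (2.12) in the form $T^2/a^3 = 4\pi^2/GM$, the sketch below solves for the mass of the Sun. The numerical inputs are assumed round values (G in SI units, the Earth's mean orbital radius, and a year of 365.25 days), not figures taken from the text.

```python
import math

G = 6.674e-11              # gravitational constant in m^3 kg^-1 s^-2 (assumed)
a_earth = 1.496e11         # mean radius of the Earth's orbit in m (assumed)
T_earth = 365.25 * 86400   # one year in seconds (assumed)


def central_mass(a, T, G=G):
    """Solve T^2 / a^3 = 4 pi^2 / (G M) for the central mass M."""
    return 4.0 * math.pi**2 * a**3 / (G * T**2)


M_sun = central_mass(a_earth, T_earth)
print(f"{M_sun:.3e} kg")
```

The same function applied to the Moon's orbital radius and period gives an estimate of the Earth's mass, which is the idea behind the mass-ratio exercise at the end of this section.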

Kepler presented his three laws as independent empirical propositions 
about regularities he had observed in planetary motions. He did not possess 
the conceptual tools needed to recognize that the laws are related to one 
another, or indeed, to recognize that they are more significant than many 
other propositions he proposed to describe planetary motion. Though we 
have seen how to infer Newton’s universal law of gravitation from Kepler’s 
laws, and conversely, we can derive all three of Kepler’s laws from Newton’s 
law, it would be a mistake to think that Newton’s law is merely a summary of 
information in Kepler’s laws. Actually, Kepler’s laws are only approximately 
true, and they can be derived only by neglecting the forces exerted by the 
planets on one another and the Sun. But we shall see that the appropriate 
corrections can be derived from Newton’s law of gravitation, which is so close 
to being an exact law of nature that only the most minute deviations from it 
have been detected. These deviations have been explained only in this 
century by Einstein’s theory of gravitation. 


Epicycles of Ptolemy 


Long before Kepler proposed elliptical orbits about the Sun, Ptolemy
described the orbits of the planets as epitrochoids centered near the Earth. It is
of some interest, therefore, to deduce the force required to produce such a
motion. An epitrochoid is a curve described by the parametric equation

$$\mathbf r = \mathbf r_1 + \mathbf r_2 = \mathbf a_1e^{i\boldsymbol\omega_1t} + \mathbf a_2e^{i\boldsymbol\omega_2t}, \tag{2.13}$$

where

$$\boldsymbol\omega_1\wedge\boldsymbol\omega_2 = 0 \tag{2.14}$$

and $\boldsymbol\omega_1\cdot\boldsymbol\omega_2 > 0$. If, instead, $\boldsymbol\omega_1\cdot\boldsymbol\omega_2 < 0$, the curve is called a hypotrochoid or
retrograde epitrochoid. Equation (2.14) is a superposition of two vectors
rotating with constant angular velocities in the same plane.

In astronomical literature, the circle generated by one of these vectors is 
called the epicycle while the circle generated by the other is called the 
deferent, and the orbit is traced out by a particle moving uniformly on the 
epicycle while the center of the epicycle moves uniformly along the deferent. 
A constant vector could be added to (2.13) to express the fact that Ptolemy 
displaced the center of the planetary orbits slightly from the earth to account 
for observed variations in speed along the orbits. 

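Now to deduce the force, we differentiate (2.13) twice; thus,

$$\dot{\mathbf r} = \boldsymbol\omega_1\times\mathbf r_1 + \boldsymbol\omega_2\times\mathbf r_2,$$
$$\ddot{\mathbf r} = \boldsymbol\omega_1\times(\boldsymbol\omega_1\times\mathbf r_1) + \boldsymbol\omega_2\times(\boldsymbol\omega_2\times\mathbf r_2). \tag{2.15}$$

We must eliminate $\mathbf r_1$ and $\mathbf r_2$ from this last expression to get a force law as a
function of $\mathbf r$ and $\dot{\mathbf r}$. To do this, note that

$$(\boldsymbol\omega_1 + \boldsymbol\omega_2)\times\dot{\mathbf r} = \boldsymbol\omega_1\times(\boldsymbol\omega_1\times\mathbf r_1) + \boldsymbol\omega_2\times(\boldsymbol\omega_2\times\mathbf r_2) - \boldsymbol\omega_1\cdot\boldsymbol\omega_2\,\mathbf r + \boldsymbol\omega_1\,\boldsymbol\omega_2\cdot\mathbf r_1 + \boldsymbol\omega_2\,\boldsymbol\omega_1\cdot\mathbf r_2.$$

But the last two terms vanish when we apply the condition that $\mathbf r_1$ and $\mathbf r_2$ be
orthogonal to a common axis of rotation along $\boldsymbol\omega_1$ and $\boldsymbol\omega_2$. Consequently we
write (2.15) in the form

$$\ddot{\mathbf r} = \boldsymbol\omega\times\dot{\mathbf r} + k\,\mathbf r, \tag{2.16}$$

where the vector

$$\boldsymbol\omega = \boldsymbol\omega_1 + \boldsymbol\omega_2 \tag{2.17a}$$

and the scalar

$$k = \boldsymbol\omega_1\cdot\boldsymbol\omega_2 \tag{2.17b}$$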


can be specified independently. 

The force law expressed by (2.16) does indeed arise in physical appli- 
cations, though not from gravitational forces. For k <0, (2.16) will be 
recognized as the equation for an isotropic oscillator in a magnetic field, 
encountered before in Exercise (3-8.5). Equation (2.16) with k > 0 describes 
the motion of electrons in the magnetron, a device for generating microwave 
radiation. 
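
The force law can be checked numerically by sampling an epitrochoid and comparing its acceleration with the combination $\boldsymbol\omega\times\dot{\mathbf r} + k\,\mathbf r$, where $\boldsymbol\omega = \boldsymbol\omega_1 + \boldsymbol\omega_2$ and $k = \boldsymbol\omega_1\cdot\boldsymbol\omega_2$. The sketch below is an illustration under stated assumptions: the orbital plane is identified with the complex plane, both angular velocities point along the normal to that plane so that $\boldsymbol\omega\times\mathbf v$ becomes multiplication by $i\omega$, and the radii and angular speeds are arbitrary sample values.

```python
import cmath


def epitrochoid_state(t, a1, a2, w1, w2):
    """Position, velocity and acceleration for two superposed rotations."""
    r1 = a1 * cmath.exp(1j * w1 * t)
    r2 = a2 * cmath.exp(1j * w2 * t)
    r = r1 + r2
    v = 1j * w1 * r1 + 1j * w2 * r2
    acc = -(w1**2) * r1 - (w2**2) * r2
    return r, v, acc


def force_law(r, v, w, k):
    """omega x v + k r in the plane, with omega x v written as i * w * v."""
    return 1j * w * v + k * r


a1, a2, w1, w2 = 1.0, 0.3, 1.0, 5.0   # assumed sample values
times = [0.1 * n for n in range(100)]
max_residual = 0.0
for t in times:
    r, v, acc = epitrochoid_state(t, a1, a2, w1, w2)
    residual = abs(acc - force_law(r, v, w1 + w2, w1 * w2))
    max_residual = max(max_residual, residual)
print(max_residual)
```

Choosing `w2` negative gives a hypotrochoid, and the same check applies with $k < 0$.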


4-2. Exercises 


(2.1) Carry out the integration in Equation (2.8) to determine the area of
an ellipse.
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(2.2) 


(2.3) 


(2.4) 


(2.5) 


(2.6) 


To relate the constants in the two different Equations (2.2) and
(2.7) for an ellipse (see Figure 2-6.14a),
(a) evaluate r at the points $\mathbf x = \pm\mathbf a$ to show that

$$\ell = a(1 - \varepsilon^2),$$

(b) and that $a\varepsilon = |\mathbf a|\,|\boldsymbol\varepsilon|$ is the distance from the center of the
ellipse to the foci;

(c) evaluate r at $\mathbf x = \mathbf b$ to show that $b^2 = (1 - \varepsilon^2)a^2 = a\ell$.

Show that for a circular orbit vy = rt, and that for circular motion 

under a force f, 


mv? 


r 


fr = fr = —- 


Show further, that if Kepler’s third law is satisfied, then f must be a 
central attractive force varying inversely with the square of the 
distance. An argument like this led Robert Hooke and others to 
suspect an inverse square gravitational force before Newton; how- 
ever, they were unable to generalize the argument to elliptical 
motion and account for Kepler’s first two laws. 

To establish the universality of his law of gravitation, Newton had to 
relate the laws of falling objects on the surface of the Earth to the 
laws of planetary motion. He was able to accomplish this after 
establishing the theorem that the gravitational force exerted by a 
spherically symmetric planet is equivalent to the force that would be 
exerted if all its mass were concentrated at its center. It follows, 
then, that the readily measured gravitational acceleration at the 
surface of the Earth is related to the mass M and radius R of the 
Earth by

$$g = \frac{GM}{R^2}.$$

Even without taking into account the oblateness and rotation of the
earth, this value for g is accurate to better than one percent. Newton
also knew that the radius of the Moon’s orbit is about 60 times the
radius of the Earth, as the Greeks had established by a geometrical
analysis of the lunar eclipse. And he possessed a fairly good value
for the radius of the Earth:

$$R = 6.40\times10^{6}\ \text{m}.$$


Use these facts to calculate the period of the Moon’s orbit and 
compare the result with the observed value of 27.32 days. 

Find the period and velocity of an object in a circular orbit just 
skimming the surface of the Earth. 

Find the height above the earth of a ‘synchronous orbit’’, circling 




the Earth in 24 hours. Such orbits are useful for ““communications 
satellites.” How many such satellites would be needed so that every 
point on the equator is in view of at least one of them? 

(2.7) Estimate the Sun-Earth mass ratio from the length of the year and
the lunar month (27.3 days), and the mean radii of the Earth’s orbit
($1.49\times10^{8}$ km) and the moon’s orbit ($3.80\times10^{5}$ km).

(2.8) Solve Equations (2.17a, b) subject to (2.14) for w, and @, in terms 
of w and k. Show thereby that every epitrochoid and every hyper- 
trochoid can be regarded as an ellipse precessing (i.e. rotating) with 
angular velocity +, and determine those values of w and k for 
which such motions are impossible. 

(2.9) What central force will admit circular orbits passing through the 
center of force? What is the value of L at r = 0? 

Hint: Show that r = 2a-f is an appropriate equation for such an 
orbit. 

(2.10) | Show that the central force under which a particle describes the 
cardioid 


r=a(1 + af) 


3a" 


4 


f=- Oe 


mr 


(2.11) Show that the central force under which a particle describes the 
lemniscate 


7 = an). 


BYiRs Be 
— 


mr 


4-3. The Kepler Problem 


The problem of describing the motion of a particle subject to a central force
varying inversely with the square of the distance from the center of force is
commonly referred to as the Kepler Problem. It is the beginning for
investigations in atomic theory as well as celestial mechanics, so it deserves to be
studied in great detail. Basically, the problem is to solve the equation of
motion

$$m\ddot{\mathbf r} = m\dot{\mathbf v} = -\frac{k}{r^2}\hat{\mathbf r}. \tag{3.1}$$

The “coupling constant” k depends on the kind of force and describes the
strength of interaction. As we saw in the last section, in celestial mechanics
the force in (3.1) is Newton’s law of gravitation and $k = GmM$. In atomic
theory, Equation (3.1) is used to describe the motion of a particle with charge
q in the electric field of a particle with charge q′; then $k = -qq'$, and the force
is known as Coulomb’s Law. The Newtonian force is always attractive
(k > 0), whereas the Coulomb force may be either attractive or repulsive
(k < 0).

There are a number of ways to solve Equation (3.1), but the most powerful
and insightful method is to determine its constants of motion. We have
already seen that angular momentum is conserved by any central force, so we
can immediately write down the constant of motion

$$\mathbf L = m\,\mathbf r\wedge\mathbf v = mr^2\,\hat{\mathbf r}\,\dot{\hat{\mathbf r}}. \tag{3.2}$$

When $\mathbf L \ne 0$, we can use this to eliminate $\hat{\mathbf r}$ from (3.1) as follows:

$$\mathbf L\dot{\mathbf v} = -\frac{k}{mr^2}\,\mathbf L\,\hat{\mathbf r} = k\,\dot{\hat{\mathbf r}}.$$

Since $\mathbf L$ is constant, this can be written

$$\frac{d}{dt}\left(\mathbf L\mathbf v - k\hat{\mathbf r}\right) = 0.$$

Therefore, we can write

$$\mathbf L\mathbf v = k(\hat{\mathbf r} + \boldsymbol\varepsilon), \tag{3.3}$$

where $\boldsymbol\varepsilon$ is a dimensionless constant vector. It should be evident that this new
vector constant of motion $\boldsymbol\varepsilon$ is peculiar to the inverse square law,
distinguishing it from all other central forces. This constant of motion is called the
Laplace vector by astronomers, since Laplace was the first of many to discover
it. It is sometimes referred to as the “Runge-Lenz vector” in the physics
literature. We shall prefer the descriptive name eccentricity vector suggested
by Hamilton.

Since $\mathbf L = i\,\mathbf l$, Equation (3.3) can be expressed in terms of the angular
momentum vector $\mathbf l$, with the result

$$\mathbf v\times\mathbf l = k(\hat{\mathbf r} + \boldsymbol\varepsilon),$$

along with the condition $\mathbf l\cdot\mathbf v = 0$. However, Equation (3.3) is much easier to
manipulate, because the geometric product $\mathbf L\mathbf v$ is associative while the cross
product $\mathbf l\times\mathbf v$ is not.
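
Conservation of the eccentricity vector can be checked by integrating the equation of motion numerically and evaluating $\boldsymbol\varepsilon = \mathbf L\mathbf v/k - \hat{\mathbf r}$ at the start and end of the run. The sketch below is illustrative only: it assumes units with m = k = 1, Cartesian components in the orbital plane (where the product $\mathbf L\mathbf v$ has components $L(v_y, -v_x)$ with $L = m(xv_y - yv_x)$), a fourth-order Runge-Kutta integrator, and arbitrary initial conditions.

```python
import math


def eccentricity_vector(r, v, m=1.0, k=1.0):
    """Return eps = L v / k - r_hat for planar motion as an (x, y) tuple."""
    x, y = r
    vx, vy = v
    L = m * (x * vy - y * vx)          # single component of the bivector L
    dist = math.hypot(x, y)
    return (L * vy / k - x / dist, -L * vx / k - y / dist)


def acceleration(r, m=1.0, k=1.0):
    """Inverse-square acceleration -k r_hat / (m r^2)."""
    x, y = r
    d3 = math.hypot(x, y) ** 3
    return (-k * x / (m * d3), -k * y / (m * d3))


def rk4_step(r, v, dt):
    """Advance (r, v) by one classical Runge-Kutta step."""
    def shifted(p, q, s):
        return (p[0] + s * q[0], p[1] + s * q[1])

    k1r, k1v = v, acceleration(r)
    k2r, k2v = shifted(v, k1v, dt / 2), acceleration(shifted(r, k1r, dt / 2))
    k3r, k3v = shifted(v, k2v, dt / 2), acceleration(shifted(r, k2r, dt / 2))
    k4r, k4v = shifted(v, k3v, dt), acceleration(shifted(r, k3r, dt))
    r_new = tuple(r[i] + dt * (k1r[i] + 2 * k2r[i] + 2 * k3r[i] + k4r[i]) / 6
                  for i in range(2))
    v_new = tuple(v[i] + dt * (k1v[i] + 2 * k2v[i] + 2 * k3v[i] + k4v[i]) / 6
                  for i in range(2))
    return r_new, v_new


r, v = (1.0, 0.0), (0.0, 1.2)          # assumed initial conditions
eps_start = eccentricity_vector(r, v)
for _ in range(10000):
    r, v = rk4_step(r, v, 1e-3)
eps_end = eccentricity_vector(r, v)
drift = math.hypot(eps_end[0] - eps_start[0], eps_end[1] - eps_start[1])
print(eps_start, eps_end, drift)
```

With these initial conditions the particle starts at the point of closest approach, so the eccentricity vector points along the initial radius vector.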


Energy and Eccentricity 


Besides $\mathbf L$ and $\boldsymbol\varepsilon$, Equation (3.1) conserves energy, for

$$-\frac{k}{r^2}\hat{\mathbf r} = -\nabla\left(-\frac{k}{r}\right),$$

so the force is conservative with potential $-kr^{-1}$. It follows, then, from our
general considerations in Section 2-10, that the energy

$$E = \tfrac12 mv^2 - \frac{k}{r} \tag{3.4}$$

is a constant of the motion. However, if $\mathbf L \ne 0$, this is not a new constant,
because E is determined by $\mathbf L$ and $\boldsymbol\varepsilon$. Thus, from (3.3) we have

$$k^2\varepsilon^2 = (\mathbf L\mathbf v - k\hat{\mathbf r})^2 = L^2v^2 - k(\mathbf L\mathbf v\hat{\mathbf r} + \hat{\mathbf r}\mathbf L\mathbf v) + k^2.$$

But, by (3.2),

$$\mathbf L\mathbf v\hat{\mathbf r} + \hat{\mathbf r}\mathbf L\mathbf v = 2\mathbf L\,\mathbf v\wedge\hat{\mathbf r} = -\frac{2\mathbf L^2}{mr} = \frac{2L^2}{mr}.$$

Hence,

$$k^2(\varepsilon^2 - 1) = L^2\left(v^2 - \frac{2k}{mr}\right).$$

The last factor in this equation must be constant, because the other factors are
constant. Indeed, using (3.4) to express this factor in terms of energy, we get
the relation

$$E = \frac{mk^2}{2L^2}(\varepsilon^2 - 1). \tag{3.5}$$

When $\mathbf L = 0$, this relation tells us nothing about energy, and our derivation of
the energy equation (3.4) from (3.2) and (3.3) fails. But we know from our
previous derivation that energy conservation holds nevertheless. Indeed, for
$\mathbf L = 0$, Equation (3.2) implies that the orbit lies on a straight line through the
origin, and the energy equation (3.4) must be used instead of (3.3) to describe
motion along that line.


The Orbit 


For L ≠ 0, the algebraic equations (3.2) and (3.3) for the constants L and ε
determine the orbit and all its geometrical properties without further
integrations. This is to be expected, because we know that the general solution of
(3.1) is determined by two independent vector constants. Now, to find an
equation for the orbit, we use (3.2) to eliminate v from (3.3). Thus,


$$k(\boldsymbol\epsilon + \hat{\mathbf r})\mathbf r = \mathbf L\mathbf v\mathbf r.$$

The scalar part of this equation is


$$k(\boldsymbol\epsilon\cdot\mathbf r + r) = \mathbf L\,\mathbf v\wedge\mathbf r = \frac{\mathbf L\mathbf L^\dagger}{m} = \frac{L^2}{m}.$$

For L ≠ 0, this yields


$$r = \frac{\pm\ell}{1 + \boldsymbol\epsilon\cdot\hat{\mathbf r}}, \tag{3.6a}$$

where

$$\ell = \frac{L^2}{m|k|}, \tag{3.6b}$$


so the sign in (3.6a) distinguishes between attractive and repulsive forces. As
we have noted in Section 2-6, Equation (3.6a) describes a conic with
eccentricity ε = |ε| and an axis of symmetry with direction ε̂. The relation (3.5)
enables us to classify the various orbits according to values either of the
geometrical parameter ε or the physical parameter E, as shown in Table 3.1.


TABLE 3.1. Classification of Orbits with L ≠ 0.


Geometrical name   Eccentricity   Energy             Hodograph center
Hyperbola          ε > 1          E > 0              u > |k|/L
Parabola           ε = 1          E = 0              u = |k|/L
Ellipse            0 < ε < 1      E < 0              u < |k|/L
Circle             ε = 0          E = −mk²/(2L²)     u = 0
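
The classification in Table 3.1 can be sketched in a few lines of code. This is an illustration only: the function name `classify_orbit`, the tolerance `tol`, and the choice to pass the magnitude L > 0 are assumptions, and the eccentricity is obtained by solving (3.5) for ε².

```python
import math

def classify_orbit(E, L, m, k, tol=1e-12):
    """Classify an orbit with L != 0 from its energy, using relation (3.5)."""
    if L == 0:
        raise ValueError("Table 3.1 applies only to L != 0")
    ecc_sq = 1.0 + 2.0 * L * L * E / (m * k * k)     # (3.5) solved for eps^2
    if ecc_sq < -tol:
        raise ValueError("energy lies below the circular-orbit value")
    ecc = math.sqrt(max(ecc_sq, 0.0))
    u = abs(k) * ecc / L                              # distance of hodograph center from origin
    if abs(ecc_sq) <= tol:
        name = "circle"
    elif abs(ecc_sq - 1.0) <= tol:
        name = "parabola"
    elif ecc_sq < 1.0:
        name = "ellipse"
    else:
        name = "hyperbola"
    return name, ecc, u

for energy in (-0.5, -0.3, 0.0, 0.5):
    print(energy, classify_orbit(energy, 1.0, 1.0, 1.0))
```

With floating-point input the circle and parabola are boundary cases, so a tolerance is needed; how tight it should be depends on how E and L were obtained.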


The Hodograph 


Since the orbit equation (3.6) is a consequence of the equation for eccentricity
conservation (3.3), we should be able to describe the motion directly with
(3.3) itself. Indeed, we can interpret (3.3) as a parametric equation v = v(r̂)
for the velocity as a function of direction by writing it in the form


$$\mathbf v = k\mathbf L^{-1}(\boldsymbol\epsilon + \hat{\mathbf r}). \tag{3.7}$$

This equation describes a circle of radius k/L centered at the point

$$\mathbf u = k\mathbf L^{-1}\boldsymbol\epsilon. \tag{3.8}$$

In standard non-parametric form, the equation for this circle is

$$(\mathbf v - \mathbf u)^2 = \frac{k^2}{L^2}. \tag{3.9}$$

Since the center of the circle is determined by the eccentricity vector in (3.8),
the distance u = |u| of the center from the origin can be used to classify the
orbits, as shown in Table 3.1. Thus, the orbit is an ellipse if the origin is inside
the circle or a hyperbola if the origin is outside the circle. For an elliptical
orbit the hodograph described by (3.7) is a complete circle, as shown in
Figure 3.1. The hodograph and the orbit are drawn with common directions
in the figure, so for any velocity v on the hodograph, the position r on the
orbit can be determined, or vice-versa. Of course, the relations between v, r̂
and ε shown on the figure are expressed algebraically by (3.7).
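
As an illustration of the hodograph, the sketch below samples an attractive elliptical orbit in the plane and checks two things: that each velocity lies on a circle of radius k/L about u, and that position and velocity together reproduce the angular momentum. It is written in ordinary 2-component vector algebra, under the assumption that the angular momentum points along +z so that (3.7) reads v = (k/L) ẑ × (r̂ + ε). The function name `hodograph_point` and the sample values are illustrative.

```python
import math

def hodograph_point(theta, eps, L, m, k):
    """Position and velocity on an attractive Kepler orbit (k > 0) at polar angle theta."""
    rx, ry = math.cos(theta), math.sin(theta)         # unit radial vector
    ex, ey = eps
    ell = L * L / (m * k)                             # semi-latus rectum (3.6b)
    r = ell / (1.0 + ex * rx + ey * ry)               # orbit equation (3.6a)
    # z_hat x (r_hat + eps), scaled by k/L: the vector-algebra form of (3.7)
    v = (-(k / L) * (ry + ey), (k / L) * (rx + ex))
    return (r * rx, r * ry), v

m, k, L, eps = 1.0, 1.0, 1.1, (0.3, 0.0)
u = (-(k / L) * eps[1], (k / L) * eps[0])             # center of the hodograph (3.8)
for j in range(8):
    pos, v = hodograph_point(2.0 * math.pi * j / 8, eps, L, m, k)
    radius = math.hypot(v[0] - u[0], v[1] - u[1])     # should equal k/L
    Lz = m * (pos[0] * v[1] - pos[1] * v[0])          # should equal L
    print(round(radius, 12), round(Lz, 12))
```

Because |ε| < 1 here, the origin of velocity space lies inside the circle, which is the elliptical case of Table 3.1.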


Fig. 3.1. Elliptical orbit and hodograph. 


The hodograph for hyperbolic motion is shown in Figure 3.3a, and should 
be compared with the corresponding orbit in Figure 3.3b. The figure shows 
that, in this case, the hodograph is only a portion of a circle. This fact cannot 
be expressed by (3.9), but (3.7) implies that v is restricted to a circular arc if r̂
is so restricted. Details of the hyperbolic motion will be worked out below.


The Initial Value Problem 


We have seen how the orbit and its hodograph are determined by the
equations for angular momentum and eccentricity conservation. The same
equations can be used to calculate the constants L and ε for objects of known
mass when the velocity v is known at one point r. This solves the initial value
problem, a basic problem in celestial mechanics. The main part of the
problem is to determine the eccentricity and orientation of the orbit, that is,
to determine the eccentricity vector ε. From (3.3) we immediately get


$$\boldsymbol\epsilon = \frac{\mathbf L\mathbf v}{k} - \hat{\mathbf r}. \tag{3.10}$$

This determines ε from v and r, but comparison with observation will be
facilitated if this result is expressed in terms of angles. The angle α between
the velocity and the radial direction can be introduced by


$$\hat{\mathbf r}\hat{\mathbf v} = e^{i\alpha}. \tag{3.11}$$
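Since the bivector i specifies a specific orientation for the orbital plane, this
equation will describe orbits with both orientations if the angle has the range
0 < α < 2π. Now we can write the first term on the right side of (3.10) in the
form


$$\frac{\mathbf L\mathbf v}{k} = \frac{mrv^2}{k}\,i\hat{\mathbf v}\sin\alpha = i\hat{\mathbf v}\lambda\sin\alpha,$$

where

$$\lambda = \frac{mrv^2}{k} = -\frac{2K}{V}, \tag{3.12}$$

with K = ½mv² and V = −k/r. Consequently, Equation (3.10) can be written in the form

$$\boldsymbol\epsilon = i\hat{\mathbf v}\lambda\sin\alpha - \hat{\mathbf r}. \tag{3.13}$$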
The determination of ε by this equation is illustrated in Figure 3.2.*
According to (3.12), the parameter λ is determined by the speed and
radial distance, or, if you will, the ratio K/V of kinetic to potential energies.
If desired, the right side of (3.13) can be expressed in terms of orthogonal
components by using (3.11) to eliminate v̂, with the result


$$\boldsymbol\epsilon = \hat{\mathbf r}(\lambda\sin^2\alpha - 1) + i\hat{\mathbf r}\lambda\sin\alpha\cos\alpha. \tag{3.14}$$


This can be used to compute the angle θ between the radial direction and the
direction to the pericenter of the orbit. We write

$$\boldsymbol\epsilon\hat{\mathbf r} = \epsilon e^{i\theta}; \tag{3.15}$$


then, from (3.14) we obtain


$$\tan\theta = \frac{\boldsymbol\epsilon\wedge\hat{\mathbf r}}{i\,\boldsymbol\epsilon\cdot\hat{\mathbf r}} = -\frac{\lambda\sin\alpha\cos\alpha}{1 - \lambda\sin^2\alpha}. \tag{3.16}$$


Fig. 3.2. Determination of the eccentricity vector.

*Graphical methods for constructing orbits based on Equation (3.13) are discussed by W. G.
Harter, Am. J. Phys. 44, 348 (1976).


Scattering


Our solution of the initial value problem works for both bounded and
unbounded orbits. However, for unbounded motion we are often interested
instead in the scattering problem, which can be formulated as follows: Given
the angular momentum L and the initial velocity v₀ of a particle approaching
the center of force from a great distance, find the final velocity v_f of the
particle receding from the center of force at a great distance. A particle that
has thus traversed an unbounded orbit is said to be scattered.

The scattering problem for hyperbolic motion can be solved by applying the 
conservation laws in the asymptotic region (defined by the condition that the 
distance from the center of force be very large). From energy and angular 
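momentum conservation, we have


$$E = \tfrac{1}{2}mv_0^2 = \tfrac{1}{2}mv_f^2,$$

and

$$\hat{\mathbf r}\wedge\mathbf v = \frac{\mathbf L}{mr} \xrightarrow[r\to\infty]{} 0.$$


The first condition implies that initial and final speeds are equal, that is,


$$v_0 = |\mathbf v_0| = |\mathbf v_f| = \left(\frac{2E}{m}\right)^{1/2}. \tag{3.17}$$

The second condition implies that

$$\hat{\mathbf r}_0 = -\hat{\mathbf v}_0 \quad\text{and}\quad \hat{\mathbf r}_f = \hat{\mathbf v}_f. \tag{3.18}$$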


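Equation (3.3) for eccentricity conservation can be cast in the equivalent
form


$$\mathbf L\mathbf v_1 - k\hat{\mathbf r}_1 = \mathbf L\mathbf v_2 - k\hat{\mathbf r}_2, \tag{3.19}$$


where r₁ and r₂ are any two points on the orbit. Using the asymptotic relations
(3.17) and (3.18) in this equation, we deduce


$$(\mathbf L v_0 - k)\hat{\mathbf v}_f = (\mathbf L v_0 + k)\hat{\mathbf v}_0.$$


Hence,

$$\hat{\mathbf v}_f = \left(\frac{\mathbf L v_0 + k}{\mathbf L v_0 - k}\right)\hat{\mathbf v}_0. \tag{3.20}$$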


This solves the scattering problem, because it gives v̂_f in terms of the initial
velocity and angular momentum. However, for comparison with experiment,
it is desirable to express this relation in terms of different parameters.

The angle Θ between the initial and final velocity is called the scattering
angle; it is defined by the equation

$$\hat{\mathbf v}_f = \hat{\mathbf v}_0 e^{i\Theta}, \tag{3.21}$$


where 0 < Θ < π. The bivector i here is related to the direction of angular
momentum by


$$\mathbf L = \pm iL, \tag{3.22a}$$

where the positive sign refers to the attractive case (k > 0), and the negative
sign refers to the repulsive case (k < 0). We have assumed that the angular
momentum has opposite orientations in the two cases so we can use (3.21)
for both cases, as shown in Figure 3.3a.


Fig. 3.3a. Hodograph for hyperbolic motion. 


The magnitude of the angular momentum can be written in the form


$$L = bmv_0 = \frac{2Eb}{v_0}, \tag{3.22b}$$


where the so-called impact parameter b is the distance of the center of force
from the asymptotes of the hyperbola, as shown in Figure 3.3b.

Now, substituting (3.21) and (3.22) into (3.20), we get an expression for the
scattering angle as a function of energy and impact parameter. Thus,

$$e^{i\Theta} = \frac{\mathbf L v_0 - k}{\mathbf L v_0 + k} = \frac{2Ebi - |k|}{2Ebi + |k|}. \tag{3.23}$$


This equation can be solved for the impact parameter, with the result


$$b = \frac{|k|}{2E}\cot\tfrac{1}{2}\Theta. \tag{3.24}$$


We will use this relation in Section 4-7.


Fig. 3.3b. The physical branch of the hyperbola depends on the sign of the "coupling constant" k (the repulsive branch corresponds to k < 0).
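
The step from (3.23) to (3.24) can be checked with complex arithmetic, since scalars plus multiples of the single bivector i multiply exactly like complex numbers. The sketch below makes that identification (i mapped to Python's `1j`), which is an assumption of the example rather than notation from the text. The function names are illustrative.

```python
import cmath
import math

def scattering_angle(E, b, k):
    """Scattering angle Theta from (3.24): tan(Theta/2) = |k| / (2 E b)."""
    return 2.0 * math.atan2(abs(k), 2.0 * E * b)

def impact_parameter(E, Theta, k):
    """Impact parameter b from (3.24)."""
    return abs(k) / (2.0 * E * math.tan(0.5 * Theta))

def rotation_factor(E, b, k):
    """Right side of (3.23), with the bivector i represented by 1j."""
    return complex(-abs(k), 2.0 * E * b) / complex(abs(k), 2.0 * E * b)

E, b, k = 1.0, 0.5, 1.0
Theta = scattering_angle(E, b, k)
print(Theta, cmath.phase(rotation_factor(E, b, k)), impact_parameter(E, Theta, k))
```

The phase of the factor in (3.23) and the angle obtained from (3.24) should coincide, and its modulus should be 1.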


4-3. Exercises 


(3.1) Equation (3.6) shows that the orbital distance r = r(r̂) has either
minimum or maximum values r± = |r±| = r(±ε̂) when r̂ = ±ε̂.
The semi-major axis a is defined as half the distance between the
points r₊ and r₋. The semi-latus rectum ℓ is defined as the orbital
distance when r̂·ε = 0 in the attractive case or r̂·ε = −2 in the
repulsive case. Verify the following relations:
For an ellipse,

$$r_\pm = \frac{\ell}{1 \pm \epsilon} = a(1 \mp \epsilon),$$

$$\epsilon = \frac{r_- - r_+}{r_+ + r_-},$$

$$\frac{v_+}{v_-} = \frac{1 + \epsilon}{1 - \epsilon}, \quad\text{where } v_\pm = |\mathbf v(\pm\hat{\boldsymbol\epsilon})|.$$

These relations are of interest in astronomy, where the point of
closest approach r₊ is called the pericenter, or
the perihelion (for an orbit about the Sun),
the perigee (for an orbit about the Earth),
the periastron (for an orbit about a star).
The point of greatest distance r₋ is called the apocenter, or aphelion
(Sun), apogee (Earth), apastron (star).
For a hyperbola,

$$2a = r_- - r_+$$

is the distance between the two branches of a hyperbola. For
hyperbolic orbits, formulas involving both r₊ and r₋ are of no
physical interest, because motions along the two branches are not
related.

For both elliptical and hyperbolic orbits, the geometrical and
physical parameters are related by

$$\ell = \frac{L^2}{m|k|} = a|1 - \epsilon^2|.$$

(3.2) The turning points of an orbit are defined by the condition v·r = 0.
Show that, for a turning point, Equation (3.3) gives the relation

$$\mathbf r = \frac{k}{2E}(\boldsymbol\epsilon - \hat{\mathbf r}).$$

Verify that, for both elliptic and hyperbolic orbits, this relation
gives the points r₊ and r₋ specified in Exercise (3.1).

(3.3) Show that the eccentricity vectors for the attractive and repulsive
hyperbolic branches are given in terms of asymptotic initial
conditions by

$$\boldsymbol\epsilon_\pm = \left(1 + \frac{2Ebi}{|k|}\right)\hat{\mathbf v}_0,$$

whence ε₊ = −ε₋, because initial velocities are opposite on the two
branches.

(3.4) Show that orbital distance can be expressed as a function of velocity
by

$$r = \frac{-k}{2E - m\mathbf u\cdot\mathbf v}.$$

For elliptical motion, compare with the orbit Equation (3.6) at
points where v·ε = 0 and v∧ε = 0.

(3.5) Escape velocity. Estimate the minimum initial velocity required for
an object to escape from the surface of the Earth.

(3.6) Orbital Transfer. It is desired to transfer a spaceship from an orbit
about Earth to an orbit about Mars. Estimate the minimum launch
velocity required to make passage on an elliptical orbit in the Sun's
gravitational field. Neglect the gravitational fields of the planets,
and assume that their orbits are circular. The orbit should be
designed to take advantage of the motion of both planets. Estimate
the time of passage, and so determine the relative position of the
planets at launching.

Data:

r_Earth = 1 AU = 150 × 10⁶ km = 93 × 10⁶ miles
r_Mars = 1.5 AU
GM_Sun = 1.3 × 10²⁰ m³ s⁻²

(3.7) Halley's comet moves in an orbit with an eccentricity of 0.97 and a
period of about 76 years. Determine the distance of its perihelion
and aphelion from the Sun in units of the Earth's radius.
(3.8) Impulsive change in orbit. An impulsive force, such as the firing of a
rocket, will produce a change Δv in the velocity of a satellite
without a significant change in position during a short time
interval. Show that, to first order, the resulting change in the eccentricity
vector of the satellite's orbit is given by

$$\Delta\boldsymbol\epsilon = \frac{\Delta\mathbf L}{k}\mathbf v + \frac{\mathbf L}{k}\Delta\mathbf v,$$

where ΔL = m r∧Δv. Use this to determine qualitatively the
effect of a radial impulse on a circular orbit (See Figure 3.4). Draw
figures to show the effect of a tangential impulse. What is the effect
of an impulse perpendicular to the orbital plane?

Fig. 3.4. Will the radial impulse on orbit (a) produce (b) or (c)?

(3.9) A satellite circles the Earth in an orbit of radius equal to twice the
radius of the Earth. The direction of motion is changed impulsively
through an angle δ towards the Earth. Determine δ so the orbit just
skims the Earth's surface.

(3.10) Ballistic Trajectory. Neglecting air drag and the like, the trajectory of
a ballistic missile is a segment of an ellipse beginning and terminating on
the surface of the Earth, as in Figure 3.5. Show that the missile's range
βR_E is determined by the formula

$$\tan\tfrac{1}{2}\beta = \frac{\sin\alpha\cos\alpha}{gR_E/v_0^2 - \sin^2\alpha},$$

where α is the firing angle measured from the vertical, as in the figure.
Determine the maximum height of the missile above the Earth on a given
orbit. Determine the firing angle α that gives maximum range for
0 < β < π/2 and given initial speed v₀.

Fig. 3.5. Ballistic trajectory.

(3.11) Atmospheric drag tends to reduce the orbit of a satellite to a circle, as
shown in Figure 3.6. For a rough estimate of this effect, suppose that the
net effect of the atmosphere is a small impulse at perigee which reduces
the satellite velocity by a factor α. Show that the resulting change in
eccentricity is

$$\Delta\boldsymbol\epsilon = -2\alpha(\epsilon + 1)\hat{\boldsymbol\epsilon}.$$

For ε = 0.9 and α = 0.01, estimate the number of orbits required to
reduce the orbit to a circle. Show that the speed at apogee actually
increases with each orbit.

Fig. 3.6. Atmospheric drag on a satellite circularizes its orbit.


4-4. The Orbit in Time


Although we have learned how to determine elliptic and hyperbolic orbits 
from arbitrary initial conditions, the Kepler Problem will not be completely 
solved until we can describe how a particle moves along the orbit in time. In 
astronomy it is not enough to determine the size and shape of a planetary 
orbit; it is necessary to be able to locate the planet on the orbit at any 
specified time. Kepler himself solved this problem in an ingenious way, and 
we cannot do better than simplify his argument a little using the modern 
algebraic apparatus at our disposal. 


Kepler’s Equation 


In our study of the harmonic oscillator we saw that elliptical orbits can be
described by the parametric equation


$$\mathbf x = \mathbf a\cos\phi + \mathbf b\sin\phi. \tag{4.1}$$

This is related to the radius vector r from a focus of the ellipse by

$$\mathbf r = \mathbf x - a\boldsymbol\epsilon = \mathbf x - \epsilon\mathbf a. \tag{4.2}$$

Hence,

$$\mathbf r = \mathbf a(\cos\phi - \epsilon) + \mathbf b\sin\phi \tag{4.3}$$

is a parametric equation r = r(φ) for the orbit. Kepler introduced the
angle φ into his description of an ellipse by a geometrical construction
involving an auxiliary circle, as shown in Figure 4.1.

Fig. 4.1. Relations among variables in Kepler's problem.

If, now, we can determine the parameter φ as a function of time
φ = φ(t), then (4.3) gives us the desired function r = r(t) at once.
From (4.2), we obtain

$$\mathbf L = m\mathbf r\wedge\dot{\mathbf r} = m\mathbf x\wedge\dot{\mathbf x} - m\epsilon\,\mathbf a\wedge\dot{\mathbf x}.$$

This can be integrated with respect to time using (4.1) to get a relation
between t and φ. Associating the zero of time with the position at pericenter,
we obtain

$$\frac{\mathbf L t}{m} = \int_{\mathbf a}^{\mathbf x}\mathbf x\wedge d\mathbf x - \epsilon\,\mathbf a\wedge(\mathbf x - \mathbf a) = \mathbf a\mathbf b\,\phi - \epsilon\,\mathbf a\mathbf b\sin\phi.$$

But in Section 4-2 we saw that

$$\frac{\mathbf L}{m} = \frac{2\pi\,\mathbf a\mathbf b}{T},$$

where T is the orbital period. Hence,

$$\frac{2\pi t}{T} = \phi - \epsilon\sin\phi. \tag{4.4}$$


This is known as Kepler’s equation for planetary motion. 


Solutions of Kepler’s Equation 


Kepler's equation gives time t as a function of φ, so it must be solved for φ
as a function of t. Unfortunately, the solution cannot be expressed in terms of
standard functions, so we must solve the equation by some approximation
method. Newton devised a mechanical method which is easy to visualize. He
noticed that Kepler's equation can be solved by projection from a trochoid,
the curve traced out by a point on a rolling wheel (See Figure 3-7.3b). As
shown in Figure 4.2, one simply marks a point a distance ε below the center of
a wheel of unit radius. Then the solution is generated by rolling the wheel
until that point has moved a horizontal distance M = 2πt/T; the value of φ at
the time t is then the measured distance that the wheel has moved.

Although Newton's mechanical method enables us to visualize the solution
to Kepler's equation, a numerical solution to any desired accuracy is easily
obtained with a computer. There are many ways to do this, but the simplest
when ε is small is to treat the last term in (4.4) as a perturbation. Thus, the
zeroth order approximation to (4.4) is

$$\phi = M.$$

Substituting this into the "perturbing term" in (4.4), we get the first order
approximation

$$\phi = M + \epsilon\sin M.$$

Fig. 4.2. Mechanical solution of Kepler's equation (φ = distance rolled).

The second order approximation is, then,

$$\phi = M + \epsilon\sin(M + \epsilon\sin M) = M + \epsilon\sin M\cos(\epsilon\sin M) + \epsilon\cos M\sin(\epsilon\sin M),$$

or, since ε is small,

$$\phi = M + \epsilon\sin M + \frac{\epsilon^2}{2}\sin 2M. \tag{4.5}$$


This approximate solution would have been quite sufficient for Kepler’s 
observations of planetary motion. 
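
For comparison with the perturbation series (4.5), here is a minimal numerical sketch. Note that it uses the Newton-Raphson iteration, a standard root-finding technique that is not the perturbation method described in the text, and that it assumes 0 ≤ ε < 1. The function names `solve_kepler` and `second_order` are illustrative.

```python
import math

def solve_kepler(M, ecc, tol=1e-14, max_iter=50):
    """Solve M = phi - ecc*sin(phi) for phi by Newton-Raphson iteration (0 <= ecc < 1)."""
    phi = M + ecc * math.sin(M)                  # first order approximation as the starting guess
    for _ in range(max_iter):
        step = (phi - ecc * math.sin(phi) - M) / (1.0 - ecc * math.cos(phi))
        phi -= step
        if abs(step) < tol:
            break
    return phi

def second_order(M, ecc):
    """Second order approximation (4.5)."""
    return M + ecc * math.sin(M) + 0.5 * ecc ** 2 * math.sin(2.0 * M)

ecc = 0.1
for M in (0.5, 1.0, 2.5):
    phi = solve_kepler(M, ecc)
    print(M, phi, second_order(M, ecc) - phi)
```

The printed differences indicate how far (4.5) is from the converged solution; for small ε they are expected to be of order ε³.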
The angle θ defined by

$$\hat{\mathbf r} = \hat{\boldsymbol\epsilon}e^{i\theta} \tag{4.6}$$

has a more direct observational significance than the angle φ, so we can make
best use of our solution to Kepler's equation by expressing θ in terms of
φ = φ(t) and so get θ = θ(t). To do this, we first square (4.3) and, using the
relation b² = a²(1 − ε²) derived in Exercise (2.2), we find that

$$r = a(1 - \epsilon\cos\phi). \tag{4.7}$$

Using this after substituting (4.6) into (4.3) we find

$$e^{i\theta} = \frac{\cos\phi - \epsilon + i(1 - \epsilon^2)^{1/2}\sin\phi}{1 - \epsilon\cos\phi}. \tag{4.8}$$

A more convenient relation between θ and φ can be found as follows: The
scalar part of (4.8) is

$$\cos\theta = \frac{\cos\phi - \epsilon}{1 - \epsilon\cos\phi},$$

whence

$$\frac{1 - \cos\theta}{1 + \cos\theta} = \left(\frac{1 + \epsilon}{1 - \epsilon}\right)\frac{1 - \cos\phi}{1 + \cos\phi},$$

or, using the half angle formula for the tangent,

$$\tan\tfrac{1}{2}\theta = \left(\frac{1 + \epsilon}{1 - \epsilon}\right)^{1/2}\tan\tfrac{1}{2}\phi. \tag{4.9}$$

This gives θ(t) uniquely in terms of φ(t), because the half angles ½θ and ½φ
always lie in the same quadrant.
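
Relation (4.9) can be sketched in code as follows. The function name `true_anomaly` is an illustrative assumption (the text does not name the angle θ), and `atan2` is used so that the half angles stay in the same quadrant for 0 ≤ φ < 2π. As a consistency check, the sketch compares the distance from (4.7) with the orbit equation (3.6a), taking ℓ = a(1 − ε²) and ε·r̂ = ε cos θ.

```python
import math

def true_anomaly(phi, ecc):
    """Angle theta of (4.6) from the parameter phi, using the half-angle relation (4.9)."""
    return 2.0 * math.atan2(math.sqrt(1.0 + ecc) * math.sin(0.5 * phi),
                            math.sqrt(1.0 - ecc) * math.cos(0.5 * phi))

ecc, a = 0.3, 1.0
for phi in (0.5, 2.0, 4.0):
    theta = true_anomaly(phi, ecc)
    r_from_phi = a * (1.0 - ecc * math.cos(phi))                               # (4.7)
    r_from_theta = a * (1.0 - ecc ** 2) / (1.0 + ecc * math.cos(theta))        # (3.6a)
    print(phi, theta, r_from_phi, r_from_theta)
```

The last two columns should agree, since both describe the same point of the ellipse.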


4-4. Exercises 


(4.1) Show that Kepler's equation can be put in the form

$$t\left(\frac{k}{ma^3}\right)^{1/2} = \phi - \epsilon\sin\phi.$$


Derive the corresponding equation for hyperbolic motion (with 
Ke): 


$$t\left(\frac{|k|}{ma^3}\right)^{1/2} = \phi - \epsilon\sinh\phi.$$

Use the parametric equation for a hyperbola

$$\mathbf x = \mathbf a\cosh\phi + \mathbf b\sinh\phi.$$


4-5. Conservative Central Forces 


The method used in Section 4-3 to analyze the motion of a particle subject to 
an inverse square law of force will not work for arbitrary central forces owing 
to the absence of a simple constant of motion like the eccentricity vector. We 
turn, therefore, to a more general method which exploits the constants of 
motion that are available. 

We have already determined in Section 4-1 that the angular momentum

$$\mathbf L = m\mathbf r\wedge\dot{\mathbf r} = mr^2\hat{\mathbf r}\dot{\hat{\mathbf r}} \tag{5.1}$$

is a constant of motion in a central field of force. And we know from Section
3-10 that for a conservative force the energy

$$E = \tfrac{1}{2}m\dot{\mathbf r}^2 + V \tag{5.2}$$

is a constant of motion. When the potential V is specified, the orbit can be
found from the Equations (5.1) and (5.2) without referring to the equation of
motion

$$m\ddot{\mathbf r} = -\nabla V = -\hat{\mathbf r}\,\partial_r V(r), \tag{5.3}$$

from which the constants of motion are derived.

The potential of a central conservative force is called a central potential.
The potential in (5.3) has been written in the special form V(r) = V(|r|)
instead of the general form V(r) = V(r, r̂) to indicate that it must be
independent of the directional variable r̂. This follows from the requirement
that variations in V expressed by ∇V be in the radial direction r̂ = ∇r,
whereas r̂ can vary only in directions orthogonal to itself. Thus a central
potential is necessarily spherically symmetric.

To determine characteristics of the motion common to all central
potentials, we endeavor to carry the solution of the equation of motion as far as
possible without assuming a specific functional form for the potential V.
Equation (5.1) suggests that we should separate the radial variable r = |r|
from the directional variable r̂. As we have already observed in Section 4-1,
from (5.1) it follows that

$$\dot{\hat{\mathbf r}} = \frac{\hat{\mathbf r}\mathbf L}{mr^2} = -\frac{\mathbf L\hat{\mathbf r}}{mr^2},$$

whence

$$\dot{\mathbf r}^2 = \left(\dot r - \frac{\mathbf L}{mr}\right)\left(\dot r + \frac{\mathbf L}{mr}\right) = \dot r^2 + \frac{L^2}{m^2r^2}, \tag{5.4}$$

where $L^2 = |\mathbf L|^2 = -\mathbf L^2$. Substituting (5.4) into the energy function (5.2),
we obtain the radial energy equation

$$\tfrac{1}{2}m\dot r^2 + \frac{L^2}{2mr^2} + V(r) = E. \tag{5.5}$$

This is identical to the energy equation for motion of a particle along a line in
an effective potential

$$U(r) = \frac{L^2}{2mr^2} + V(r) \tag{5.6}$$

with the restriction r > 0. Thus, the 3-dimensional central force problem has
been reduced to an equivalent 1-dimensional problem.
Putting (5.5) in the form

$$\dot r^2 = \frac{2}{m}(E - U(r)), \tag{5.7}$$

we see that the variables r and t are separable after taking the square root, so
we can integrate immediately to get

$$t = \int_{r_0}^{r}\frac{dr}{\left[\frac{2}{m}(E - U(r))\right]^{1/2}}. \tag{5.8}$$

This integral is not well defined until we have specified the range of values for
r. The range can be determined by noting that since ṙ² ≥ 0, Equation (5.7)
implies that the allowed values of r are restricted by the inequality

$$U(r) = \frac{L^2}{2mr^2} + V(r) \le E. \tag{5.9}$$

When ṙ = 0, the inequality reduces to an equation U(r) = E whose roots are
maximum and minimum values of r specifying turning points (or apses) of the
motion. A minimum allowed value r_min always exists, since r must be positive.
If there is no maximum allowed value, the motion is unbounded. If there is a
maximum value r_max, the motion is bounded, and at the distance r_max the
particle will change from retreat to approach towards the center of force.
Accordingly, the integral (5.8) can be taken over increasing values of r only to
the point r_max, from which point it must be taken over decreasing values of r.
Of course, the integral can be taken repeatedly over the range from r_min to
r_max corresponding to repeated oscillations of the particle. The period of a
single complete oscillation in the value of r is therefore given by

$$T = 2\int_{r_{\min}}^{r_{\max}}\frac{dr}{\left[\frac{2}{m}(E - U(r))\right]^{1/2}}.$$
This defines a period for all bounded central force motion. By considering 
diagrams for the orbits it can be seen that this definition of period must agree 
with the usual one for a particle in a Coulomb potential, but it gives only half 
the usual one for a harmonic oscillator, because it is a period of the radial 
motion rather than a period of angular motion. 
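
The radial period integral can be evaluated numerically once the turning points are known. The sketch below is one possible approach, not taken from the text: it substitutes r = ½(r_max + r_min) − ½(r_max − r_min) cos u, which removes the inverse square-root singularities at the turning points, and then applies the midpoint rule in u. The function name `radial_period`, the sample parameters, and the standard Kepler relations a = k/(2|E|) and T = 2π(ma³/k)^{1/2} used for comparison are assumptions of the example.

```python
import math

def radial_period(U, E, m, r_min, r_max, n=2000):
    """Twice the integral of dr / sqrt((2/m)(E - U(r))) between the turning points."""
    mid = 0.5 * (r_max + r_min)
    half = 0.5 * (r_max - r_min)
    total = 0.0
    for j in range(n):
        u = math.pi * (j + 0.5) / n              # midpoints never touch the turning points
        r = mid - half * math.cos(u)
        total += half * math.sin(u) / math.sqrt((2.0 / m) * (E - U(r)))
    return 2.0 * total * math.pi / n

# Coulomb (inverse square force) example in units with m = k = 1.
m, k, L, E = 1.0, 1.0, 1.1, -0.375
U = lambda r: L * L / (2.0 * m * r * r) - k / r
root = math.sqrt(k * k + 2.0 * E * L * L / m)
r_min, r_max = (k - root) / (2.0 * abs(E)), (k + root) / (2.0 * abs(E))
a = k / (2.0 * abs(E))
print(radial_period(U, E, m, r_min, r_max), 2.0 * math.pi * math.sqrt(m * a ** 3 / k))
```

For the Coulomb potential the two printed numbers should agree, in line with the remark that the radial period coincides with the usual orbital period in that case.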

To determine the particle's orbit, we must use (5.1) to relate changes in the
direction r̂ to changes in the radial distance r. If we parametrize r̂ as a
function of angle θ by writing

$$\hat{\mathbf r} = \hat{\mathbf r}_0e^{i\theta},$$

where r̂₀ is a fixed unit vector, then, as we have seen in Section 4-1,
Equation (5.1) reduces to

$$L = mr^2\dot\theta. \tag{5.10}$$

We can use this to make the change of variables

$$\dot r = \dot\theta\frac{dr}{d\theta} = \frac{L}{mr^2}\frac{dr}{d\theta},$$

which, on substitution into (5.7) gives

$$\left(\frac{dr}{d\theta}\right)^2 = \frac{2mr^4}{L^2}(E - U(r)).$$

Separating variables and integrating, we get an equation for the orbit in the
form

$$\theta = \int_{r_0}^{r}\frac{L\,dr}{r^2[2m(E - U(r))]^{1/2}}. \tag{5.12}$$

The integral can be evaluated in terms of known functions only for special
potentials, such as those of the form

$$V = -\frac{k}{r^n}, \tag{5.13}$$

where k is a constant and n is a nonzero integer. For n = −2, 1, 2 the integral
can be evaluated in terms of inverse trigonometric functions, corresponding
to the linear, the inverse square and the inverse cube force laws. For
n = −6, −4, −1, 3, 4, 6 the integral can be evaluated in terms of elliptic
functions (Appendix B). For other values of n the integral cannot be
expressed in terms of tabulated functions.

We can ascertain the general features of the orbit without actually
evaluating the integral (5.12). When integrated over a period of bounded motion,
(5.12) gives

$$2\pi + \Delta\theta = \int_{r_{\min}}^{r_{\max}}\frac{2L\,dr}{r^2[2m(E - U(r))]^{1/2}}, \tag{5.14}$$

where Δθ is the deviation from an angular period of 2π. As shown in Figure
5.1, Δθ can be regarded as the angular displacement of the apse in one
period. For Kepler's elliptical orbits Δθ = 0, so the other central
force orbits can be regarded as ellipses precessing (not necessarily
at a constant rate) through a total angle Δθ in one period.

According to (5.14), the angular precession is determined by
integrating over any half period of the motion beginning at an
apse. Indeed, from the symmetry of the integral we can infer that
the orbit must be symmetric with respect to reflection about each
apsidal line. So if we have a segment of the orbit from one apse
to the next, say from apse 1 to apse 2 in Figure 5.1, then we can get the next
segment from apse 2 to apse 3 by reflecting the first segment through apsidal
line 2. The segment after that is obtained by reflection through apsidal line 3,
and so on. In this way, the entire orbit can be generated from a single
segment.

Fig. 5.1. Orbital symmetry of bounded motion in a central potential.
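
The precession angle Δθ in (5.14) can be estimated numerically in the same spirit. The following sketch is illustrative only: it assumes the turning points are known, removes the endpoint singularities with the substitution r = ½(r_max + r_min) − ½(r_max − r_min) cos u, and uses the midpoint rule. The added term −β/r² in the second potential is a hypothetical perturbation chosen because its precession is known in closed form, Δθ = 2π(L/(L² − 2mβ)^{1/2} − 1); it does not appear in the text.

```python
import math

def apsidal_advance(U, E, L, m, r_min, r_max, n=4000):
    """Delta-theta of (5.14): angle swept in one radial period, minus 2*pi."""
    mid = 0.5 * (r_max + r_min)
    half = 0.5 * (r_max - r_min)
    total = 0.0
    for j in range(n):
        u = math.pi * (j + 0.5) / n
        r = mid - half * math.cos(u)
        total += 2.0 * L * half * math.sin(u) / (r * r * math.sqrt(2.0 * m * (E - U(r))))
    return total * math.pi / n - 2.0 * math.pi

def turning_points(E, k, L_eff_sq, m):
    """Roots of E = -k/r + L_eff_sq/(2 m r^2), assuming E < 0 < k."""
    root = math.sqrt(k * k + 2.0 * E * L_eff_sq / m)
    return (k - root) / (2.0 * abs(E)), (k + root) / (2.0 * abs(E))

m, k, L, E, beta = 1.0, 1.0, 1.1, -0.375, 0.02
kepler = lambda r: L * L / (2.0 * m * r * r) - k / r
perturbed = lambda r: L * L / (2.0 * m * r * r) - k / r - beta / (r * r)
print(apsidal_advance(kepler, E, L, m, *turning_points(E, k, L * L, m)))
print(apsidal_advance(perturbed, E, L, m, *turning_points(E, k, L * L - 2.0 * m * beta, m)))
```

The first value should vanish to rounding error, as stated for Kepler's ellipses; the second should be a small positive precession.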

An orbit that eventually repeats itself exactly is said to be closed. In the 
present case, the condition for a closed orbit is 


$$\Delta\theta = 2\pi\frac{n}{m}, \tag{5.15}$$


where n and m are integers. This is the condition that the functions r = r(t)
and θ = θ(t) have commensurable periods. Examples of closed orbits are
shown in Figure 5.2.


Fig. 5.2. Central force motions with three symmetry axes.


Energy Diagrams 


As we have noted before, in atomic physics individual orbits cannot be 
observed, so it is necessary to characterize the state of motion in terms of 
conserved quantities. In particular, it is important to know how the general 
states of bounded and unbounded motion allowed by a particular potential 
depend on energy and angular momentum. This would also provide a useful 
classification of allowed orbits. The desired information is contained in the 
radial energy equation (5.5), which describes the state of motion explicitly in 
terms of L and E. The information can be put in an easily surveyable form by
constructing energy diagrams. We learn best how to do this by considering a
specific example.

Let us classify the allowed motions in the potential

$$V(r) = -\frac{ke^{-r/a}}{r}, \tag{5.16}$$

where k and a are positive constants. This is called the screened Coulomb
potential in atomic physics, where the exponential factor e^{−r/a} describes a
partial screening (or cancellation) of the nuclear Coulomb potential −k/r by a
cloud of electrons surrounding the nucleus. A potential of the same form
(5.16) arises also in nuclear physics, where it is called the Yukawa potential in
honor of the Japanese Nobel Laureate who was the first to use it to describe
nuclear interactions.
To interpret the radial energy equation

$$E = \tfrac{1}{2}m\dot r^2 + U(r) \tag{5.17}$$

graphically, we need a graph of the effective potential

$$U(r) = -\frac{k}{r}e^{-r/a} + \frac{L^2}{2mr^2}. \tag{5.18}$$


The term L²/2mr² is called the centrifugal potential because of its relation to
the centrifugal force discussed in Section 5-7. To graph the function U(r), we
first examine its asymptotic values. For small r << a, we have

$$\frac{e^{-r/a}}{r} \approx \frac{1}{r} \ll \frac{1}{r^2}, \quad\text{so}\quad U \approx \frac{L^2}{2mr^2} > 0,$$

or for large r >> a, we have

$$\frac{e^{-r/a}}{r} = \frac{1}{re^{r/a}} \ll \frac{1}{r^2}, \quad\text{so}\quad U \approx \frac{L^2}{2mr^2} > 0.$$

Thus, when L ≠ 0 the centrifugal potential L²/2mr² dominates the Yukawa
potential in the asymptotic regions (where r is very large or very small).

Next we determine the inflection points of the effective potential (that is,
the points where its first derivative vanishes). To simplify the notation let
us write

$$s = r/a \tag{5.19a}$$

and

$$\alpha = \frac{L^2}{2mka}, \tag{5.19b}$$

so that

$$U(r) = U(as) = \frac{k}{a}\left(-\frac{e^{-s}}{s} + \frac{\alpha}{s^2}\right).$$

At an inflection point we have

$$\partial_sU = \frac{k}{a}\left[\left(\frac{1}{s} + \frac{1}{s^2}\right)e^{-s} - \frac{2\alpha}{s^3}\right] = 0,$$

or

$$e^s = \frac{s(s + 1)}{2\alpha}. \tag{5.20}$$


Fig. 5.3. Graphical solutions of a transcendental equation.


This is a transcendental equation for s as a function of α. Its solutions are
points of intersection of the exponential curve y = e^s with the parabola
y = s(s + 1)/2α. As graphs of the functions in Figure 5.3 show, there are
three distinct possibilities. (a) For a particular constant α₀, the parabola
intersects the exponential curve at exactly one point s₀ > 0, a point of
tangency. In this case, then, U(r) has a single inflection point. (b) For α > α₀
the curves do not intersect for positive s, in which case U(r) has no inflection
points. (c) For α < α₀ the curves intersect and they must intersect exactly
twice for s > 0, because e^s increases with s faster than any power of s.
Therefore, U(s) has two inflection points in this case.

The constants s₀ and α₀ are determined by the requirement that the
exponential curve be tangent to the parabola, so by differentiating (5.20) we
obtain

$$e^{s_0} = \frac{2s_0 + 1}{2\alpha_0}.$$

Equating this to (5.20) we obtain

$$s_0^2 - s_0 - 1 = 0,$$

which has the single positive solution

$$s_0 = r_0/a = \tfrac{1}{2}(1 + \sqrt5) = 1.62. \tag{5.21a}$$

Furthermore,

$$\alpha_0 = (s_0 + \tfrac{1}{2})e^{-s_0} = 0.42. \tag{5.21b}$$


Note that the values of these constants are independent of the values for the
physical constants in the problem. It is amusing to note also that the number
½(1 + √5) is the famous golden ratio, to which the ancient Greeks attributed
a mystical significance. The golden ratio has many remarkable mathematical
properties (among which might be numbered its appearance in this
problem), and it appears repeatedly in art and science in a variety of peculiar
ways. The Greeks, for example, believed that a perfect rectangle is one whose
sides are in proportion to the golden ratio, and this ratio runs throughout
their art and architecture, including the Parthenon.
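
The constants in (5.21) and the three cases for the transcendental equation (5.20) can be checked with a short computation. This is a sketch under stated assumptions: the roots of (5.20) are located by scanning for sign changes on a grid and refining by bisection, and the function name `stationary_points` along with the grid parameters `s_max` and `steps` are illustrative choices, not anything prescribed by the text.

```python
import math

s0 = 0.5 * (1.0 + math.sqrt(5.0))            # (5.21a), the golden ratio
alpha0 = (s0 + 0.5) * math.exp(-s0)          # (5.21b)

def stationary_points(alpha, s_max=40.0, steps=4000):
    """Positive roots of exp(s) = s(s+1)/(2 alpha), by sign-change scan plus bisection."""
    f = lambda s: math.exp(s) - s * (s + 1.0) / (2.0 * alpha)
    roots = []
    ds = s_max / steps
    lo, f_lo = ds, f(ds)
    for j in range(2, steps + 1):
        hi = j * ds
        f_hi = f(hi)
        if f_lo * f_hi < 0.0:                # a root lies between lo and hi
            a, b, fa = lo, hi, f_lo
            for _ in range(60):
                c = 0.5 * (a + b)
                fc = f(c)
                if fa * fc <= 0.0:
                    b = c
                else:
                    a, fa = c, fc
            roots.append(0.5 * (a + b))
        lo, f_lo = hi, f_hi
    return roots

print(round(s0, 2), round(alpha0, 2))
print(stationary_points(0.3))                # alpha < alpha0
print(stationary_points(0.5))                # alpha > alpha0
```

For α below α₀ the scan is expected to return two roots, one on each side of s₀, and for α above α₀ none, matching cases (c) and (b). Very close to α₀ the two roots nearly merge, and a finer grid would be needed to separate them.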

Graphs of the effective potential for the three possible cases are shown in 
Figure 5.4. The case with two inflection points is most interesting, so let us 
examine it in detail. A typical graph is shown in Figure 5.5. Allowed orbits 
are represented in the figure as lines of constant energy in regions where they 
pass above the effective potential. Characteristics of the various allowed 
states of motion can be read off the figure. Thus 


Fig. 5.4. The effective potential as a function of angular momentum L.


Fig. 5.5. Energy diagram for motion in a screened Coulomb potential.


(a) A particle with energy E, will oscillate between the turning points r, 
and r, in Figure 5.5. According to the radial energy equation (5.17), the
radial kinetic energy at any point is given by £, — U(r), with a maximum 
value at r,. 

(b) A particle with energy E, is allowed only at the radius r, where it has 
zero radial kinetic energy, so this state corresponds to a circular orbit. If the 
particle is given a small impetus raising its radial kinetic energy and its total 
energy from E, to E,, say, then the radius of its orbit will oscillate without 
deviating far from r,. For this reason we say that the circular orbit is stable. 
(c) A particle with energy E, will be bound if its initial radius is less than r, 
and free (in an unbounded orbit) if its initial radius is greater than r,. Note 
that two possibilities like this can occur only for positive energies. They 
cannot occur in a Coulomb potential for which, as we have shown in 
Section 4-3, all bound states have negative energies. Bound states with 
positive energy have a special significance in quantum theory. In quantum 




theory the energy of a particle is subject to fluctuations of short duration, so 

a particle with energy E, has a finite probability of temporarily increasing 

its energy to more than E,, enabling it to ‘‘jump” the potential barrier from 

r, to r, and so pass from a bound to a free state with no net change in its 

energy. Thus, in quantum theory, bound states with positive energy have a 

finite lifetime. 

(d) No particle with energy greater than E, in Figure 5.5 has bound states 

of motion. If a particle with energy E,, for instance, has an initial negative 

radial speed, it will approach the origin until it ‘‘collides with the potential 
wall” at r,, after which it will retreat to infinity. 

To summarize, Figure 5.5 shows an effective potential with a potential well 
of depth E,. The bound states are composed of particles ‘‘trapped in the well” 
with energy less than E,. All other states are unbound. 

Returning now to Figure 5.4 and recalling (5.19b), we see that there is a 
critical value of the angular momentum 


L, = (2mkaa,)^{1/2} (5.22)


For L < L,, the effective potential has a dip, allowing bound states. For 
L > L,, the effective potential has no dip, so there can be no bound states. 
For L = L, there is a single bound state, a circular orbit of radius r, = 1.62a; 
however, this orbit is unstable, because a small increase in energy will free the 
particle completely. 


Stability of Circular Orbits 


Every central potential admits circular orbits for certain values of the energy 
and angular momentum. We have seen, however, that stability of a circular 
orbit against small disturbances depends on the curvature of the effective 
potential. This insight enables us to ascertain the stability of circular orbits in 
any given central potential with ease. The question of stability is more than 
academic, for the only circular orbits we can hope to observe in nature are 
stable ones, and only motions along stable orbits can be controlled in the 
laboratory. 

Let us investigate the stability of circular orbits in the important class of
attractive potentials with the form $V = -k/r^n$. The effective potential is

$$U(r) = -\frac{k}{r^n} + \frac{L^2}{2mr^2}. \tag{5.23}$$


Now, for a circular orbit the radius $r$ must be constant, so $\dot{r} = 0$, and it must
stay that way, so $\ddot{r} = 0$. Differentiating the radial energy equation (5.17), we
find

$$m\dot{r}\ddot{r} = -\dot{r}\,\partial_r U,$$

or

$$m\ddot{r} = -\partial_r U. \tag{5.24}$$


Hence, the radii of the possible circular orbits are the stationary points of the
effective potential, determined by the condition $\partial_r U = 0$. In Figure 5.5, for
example, the stationary point r, is the radius of a circular orbit with energy E,,
although the orbit is obviously unstable. Applying the condition for circular
orbits to (5.23) we obtain

$$\partial_r U = \frac{nk}{r^{n+1}} - \frac{L^2}{mr^3} = 0.$$

Hence,

$$r^{n-2} = \frac{nmk}{L^2}. \tag{5.25}$$

The orbit will be stable if this stationary point is a minimum and unstable if it is
a maximum. Differentiating the effective potential once more, we obtain

$$\partial_r^2 U = -\frac{n(n+1)k}{r^{n+2}} + \frac{3L^2}{mr^4},$$

and using (5.25) we find that

$$\partial_r^2 U = \frac{L^2}{mr^4}(2 - n) \tag{5.26}$$

at the stationary point. From (5.26) we can conclude that the attractive
potential $-k/r^n$ admits stable circular orbits if $n < 2$ but not if $n > 2$. The case
$n = 2$ requires further examination (see Exercise 5.3).
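
The stability criterion can be checked numerically. The following is a minimal Python sketch and not part of the original derivation; the helper names `circular_orbit_curvature` and `is_stable` and the default values of k, L and m are illustrative assumptions. It solves (5.25) for the circular-orbit radius and evaluates the second derivative of the effective potential there, which should reproduce the sign predicted by (5.26).

```python
def circular_orbit_curvature(n, k=1.0, L=1.0, m=1.0):
    """Return (r0, d2U) for U(r) = -k/r**n + L**2/(2*m*r**2).

    r0 is the circular-orbit radius obtained from (5.25) and d2U is the
    second radial derivative of U at r0. Hypothetical helper for illustration.
    """
    if n == 2 or n * k <= 0:
        raise ValueError("need an attractive potential with n != 2")
    r0 = (n * m * k / L**2) ** (1.0 / (n - 2))   # from r**(n-2) = n*m*k/L**2
    d2U = -n * (n + 1) * k / r0 ** (n + 2) + 3 * L**2 / (m * r0**4)
    return r0, d2U


def is_stable(n, k=1.0, L=1.0, m=1.0):
    """A circular orbit is stable when the effective potential has a minimum."""
    return circular_orbit_curvature(n, k, L, m)[1] > 0


for n in (0.5, 1, 1.5, 3, 4):
    r0, d2U = circular_orbit_curvature(n)
    print(n, round(r0, 4), is_stable(n))
```

The case n = 2 is excluded on purpose, since (5.25) then fixes L rather than the radius.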


4-5. Exercises 


Central force problems are also found in Section 4-2. 
(5.1) Show that a particle with nonzero angular momentum in a central
potential can fall to the origin only if

$$V(r) \to -\frac{k}{r^n} \quad \text{as } r \to 0, \quad \text{where } n \ge 2.$$

(5.2) For the cases n = -2, 1, 2, integrate (5.12) with the potential (5.13)
and invert the result to get an equation for the orbit in the form
$r = r(\theta)$. Compare with previously obtained orbits for these cases.
Determine the precession angle $\Delta\theta$ for each case.


(5.3) Use energy diagrams to classify orbits in the attractive central 
potential —k/r*. Can circular orbits be stable in this potential? 
(5.4) Determine the necessary conditions for stable circular orbits in the 


potential 
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—k k’ 
Vir) = ee ee 


4-6. Two-Particle Systems 


Consider a system of two particles with masses $m_1$, $m_2$ and positions $\mathbf{x}_1$, $\mathbf{x}_2$
respectively. The equations of motion for the particles can be written in the
form

$$m_1\ddot{\mathbf{x}}_1 = \mathbf{f}_{12} + \mathbf{F}_1, \tag{6.1a}$$
$$m_2\ddot{\mathbf{x}}_2 = \mathbf{f}_{21} + \mathbf{F}_2, \tag{6.1b}$$

where $\mathbf{f}_{12}$ is the force exerted on particle 1 by particle 2, and $\mathbf{F}_1$ is the force
exerted on particle 1 by agents external to the system, with a similar description
of the forces on particle 2. To write (6.1a, b) we have appealed to the
superposition principle to separate the internal forces exerted by the particles
on one another from the external forces $\mathbf{F}_1$ and $\mathbf{F}_2$.

Our aim now is to describe the system by distinguishing the external motion
of the system as a whole from the internal motions of its parts. This aim is
greatly facilitated by Newton's third law, which holds that the mutual forces
of two particles on one another are equal and opposite, that is,

$$\mathbf{f}_{12} = -\mathbf{f}_{21}. \tag{6.2}$$

We shall see later that Newton's third law is not universally true in this form.
Nevertheless, it is usually an excellent approximation, and we can easily tell if
it is violated when the form of the force $\mathbf{f}_{12}$ is specified, so we sacrifice little by
adopting it.

Notice now that, because of (6.2), the internal forces cancel when we add
the two Equations (6.1a, b). The result can be written in the form

$$M\ddot{\mathbf{X}} = \mathbf{F}_1 + \mathbf{F}_2, \tag{6.3}$$

where

$$M = m_1 + m_2 \tag{6.4}$$

is said to be the mass of the system, and

$$\mathbf{X} = \frac{m_1\mathbf{x}_1 + m_2\mathbf{x}_2}{m_1 + m_2} \tag{6.5}$$

is called its center of mass.

Equation (6.3) can be regarded as an equation of motion for the system as a
whole. It is like the equation of motion for a single particle with mass $M$ and
position $\mathbf{X}$, except that the total external force $\mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2$ generally depends
on the structure of the system (i.e. the positions and velocities of its parts).
Among the few structurally independent external forces we have the following:
If the system is immersed in a uniform force field, then the total external
force is obviously constant. Thus, for a constant gravitational field, we have

$$\mathbf{F} = m_1\mathbf{g} + m_2\mathbf{g} = M\mathbf{g}. \tag{6.6}$$

Or, if the system is subject to a linear resistive force proportional to the mass,
then the external force is

$$\mathbf{F} = -\gamma(m_1\dot{\mathbf{x}}_1 + m_2\dot{\mathbf{x}}_2) = -\gamma M\dot{\mathbf{X}}. \tag{6.7}$$

Finally, for particles with a constant charge to mass ratio $\alpha = q_1/m_1 = q_2/m_2$
in a uniform magnetic field, the external force is

$$\mathbf{F} = \frac{q_1}{c}\dot{\mathbf{x}}_1 \times \mathbf{B} + \frac{q_2}{c}\dot{\mathbf{x}}_2 \times \mathbf{B} = \frac{\alpha}{c}(m_1\dot{\mathbf{x}}_1 + m_2\dot{\mathbf{x}}_2) \times \mathbf{B} = \frac{Q}{c}\dot{\mathbf{X}} \times \mathbf{B}, \tag{6.8}$$

where $Q = q_1 + q_2$ is the total charge of the system. In general, the total
force on a system is independent of structure only when it can be expressed as
a function of the form $\mathbf{F} = \mathbf{F}(\mathbf{X}, \dot{\mathbf{X}}, t)$, as in the examples just mentioned.
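The internal structure can be described in terms of the particle positions $\mathbf{r}_1$
and $\mathbf{r}_2$ with respect to the center of mass; they are given by

$$\mathbf{r}_1 = \mathbf{x}_1 - \mathbf{X} = \frac{m_2}{M}\mathbf{r}, \tag{6.9a}$$
$$\mathbf{r}_2 = \mathbf{x}_2 - \mathbf{X} = -\frac{m_1}{M}\mathbf{r}, \tag{6.9b}$$

where

$$\mathbf{r} = \mathbf{x}_1 - \mathbf{x}_2 = \mathbf{r}_1 - \mathbf{r}_2 \tag{6.10}$$

is the relative position of the particles with respect
to one another (see Figure 6.1). Differentiating
(6.9a, b), we have

$$\dot{\mathbf{r}}_1 = \dot{\mathbf{x}}_1 - \dot{\mathbf{X}} = \frac{m_2}{M}\dot{\mathbf{r}}, \tag{6.11a}$$
$$\dot{\mathbf{r}}_2 = \dot{\mathbf{x}}_2 - \dot{\mathbf{X}} = -\frac{m_1}{M}\dot{\mathbf{r}}, \tag{6.11b}$$

and

$$\ddot{\mathbf{r}}_1 = \ddot{\mathbf{x}}_1 - \ddot{\mathbf{X}} = \frac{m_2}{M}\ddot{\mathbf{r}}, \qquad \ddot{\mathbf{r}}_2 = \ddot{\mathbf{x}}_2 - \ddot{\mathbf{X}} = -\frac{m_1}{M}\ddot{\mathbf{r}}.$$

Fig. 6.1. Variables for a two-particle system.

Using these equations to eliminate $\mathbf{x}_1$ and $\mathbf{x}_2$ after we subtract (6.1b) from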
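(6.1a), we get

$$\frac{2m_1m_2}{M}\ddot{\mathbf{r}} + (m_1 - m_2)\ddot{\mathbf{X}} = 2\mathbf{f}_{12} + \mathbf{F}_1 - \mathbf{F}_2,$$

and using (6.3) to eliminate $\ddot{\mathbf{X}}$, we obtain

$$\mu\ddot{\mathbf{r}} = \mathbf{f}_{12} + \frac{m_2\mathbf{F}_1 - m_1\mathbf{F}_2}{M}, \tag{6.12}$$

where the quantity

$$\mu = \frac{m_1m_2}{M} = \frac{m_1m_2}{m_1 + m_2} \tag{6.13}$$

is called the reduced mass.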

Equation (6.12) describes the internal motion of the two-particle system.
The effect of external forces on the internal motion is determined by the term
$(m_2\mathbf{F}_1 - m_1\mathbf{F}_2)/M$. Note that this term vanishes for the external gravitational
force (6.6) and it is a function of $\dot{\mathbf{r}}$ only for the forces (6.7) and (6.8). For such
external forces, therefore, if the internal force $\mathbf{f}_{12}$ depends only on the relative
motions of the particles, then (6.12) is an equation of the form

$$\mu\ddot{\mathbf{r}} = \mathbf{f}(\mathbf{r}, \dot{\mathbf{r}}). \tag{6.14}$$

But this is the equation of motion for a single particle of mass $\mu$ subject to the
force $\mathbf{f}(\mathbf{r}, \dot{\mathbf{r}})$. In this way, the problem of solving the coupled equations of
motion for a two-particle system can often be reduced to an equivalent pair of
one-particle problems with equations of motion (6.3) and (6.14).
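
The change of variables behind this reduction is easy to check numerically. The sketch below is an illustration only; the function names `reduced_mass` and `split_coordinates` and the sample masses and positions are assumptions, not taken from the text. It computes the center of mass (6.5), the relative position (6.10) and the internal positions (6.9a, b), so that one can confirm that each particle position is recovered as the center of mass plus its internal position.

```python
def reduced_mass(m1, m2):
    """Reduced mass of a two-particle system, as in (6.13)."""
    return m1 * m2 / (m1 + m2)


def split_coordinates(m1, x1, m2, x2):
    """Return (X, r, r1, r2): center of mass, relative and internal positions.

    Positions are plain tuples of floats. Hypothetical helper for illustration.
    """
    M = m1 + m2
    X = tuple((m1 * a + m2 * b) / M for a, b in zip(x1, x2))   # (6.5)
    r = tuple(a - b for a, b in zip(x1, x2))                   # (6.10)
    r1 = tuple(m2 / M * c for c in r)                          # (6.9a)
    r2 = tuple(-m1 / M * c for c in r)                         # (6.9b)
    return X, r, r1, r2


m1, m2 = 2.0, 1.0
x1, x2 = (1.0, 0.0, 0.0), (-2.0, 3.0, 0.0)
X, r, r1, r2 = split_coordinates(m1, x1, m2, x2)
print("center of mass:", X)
print("internal positions:", r1, r2)
print("reduced mass:", reduced_mass(m1, m2))
```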

External forces often have a relatively small effect on the internal motion, 
even when the force function cannot be put in the functional form of (6.14). 
In such cases, the external forces can be handled by perturbation theory, as 
will be demonstrated in detail in Chapter 8. Before taking external forces into 
account, we should analyze the effect of internal forces acting alone. 


Isolated Systems 


A system of particles subject to a negligible external force is said to be
isolated. From (6.3) it follows that the center of mass of an isolated two-particle
system moves with constant velocity $\dot{\mathbf{X}}$. The only problem then is to
solve the equations of motion for the internal motion. Evidently for an
isolated pair of particles, the relative position $\mathbf{r} = \mathbf{x}_1 - \mathbf{x}_2$ and the relative
velocity $\dot{\mathbf{r}}$ are the only relevant kinematical variables if the particles are
structureless objects themselves. So we expect that an internal equation of
motion of the general form (6.14) will be valid quite generally. Certainly the
force law will be of the form $\mathbf{f}(\mathbf{r}, \dot{\mathbf{r}})$ if each particle is the center of force for the
force it exerts on the other particle. Our previous analysis of central force
motion for a single particle can therefore be applied immediately to the more
realistic case of two interacting particles. The only mathematical difference in
the two cases is that, as (6.14) shows, the reduced mass $\mu$ must be used in the
equation of motion instead of the mass of a single particle. If the mass of one
particle, say $m_1$, is negligible compared to $m_2$, then according to (6.13),
$\mu \approx m_1$ and (6.14) reduces to a single particle equation.


Two-Body Effects on the Kepler Problem 


Let us see now how our previous analysis of motion under a gravitational or
Coulomb force must be modified to take into account the "two-body effects"
arising from the finite mass of both particles. As before, the force law is

$$\mathbf{f}_{12} = -\frac{k(\mathbf{x}_1 - \mathbf{x}_2)}{|\mathbf{x}_1 - \mathbf{x}_2|^3} = -\frac{k\mathbf{r}}{r^3} = -\frac{k\hat{\mathbf{r}}}{r^2}, \tag{6.15}$$

so (6.14) takes the specific form

$$\mu\ddot{\mathbf{r}} = -\frac{k\hat{\mathbf{r}}}{r^2}. \tag{6.16}$$

Obviously, Kepler's first two laws still follow from (6.16), but his third law
must be modified, because it depends on the value of the mass $\mu$.

From (2.10) and (2.11), we see that Kepler's third law should be modified
to read

$$\frac{T^2}{a^3} = \frac{4\pi^2\mu}{k} = \frac{4\pi^2}{G(m_1 + m_2)}, \tag{6.17}$$

where the last form holds for the gravitational case $k = Gm_1m_2$.
For planetary motion, the mass of a planet is too small compared to the mass
of the Sun to produce an observable deviation from Kepler's third law, except
in the case of Jupiter, where $m_{\text{Jupiter}} = (0.001)\,m_{\text{Sun}}$. For binary stars, Equation
(6.17) is used to deduce the masses from observations of periods.
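
To get a feeling for the size of the two-body correction, the following sketch evaluates the gravitational form of the modified third law, $T^2 = 4\pi^2 a^3 / G(m_1 + m_2)$. It is an illustration under stated assumptions: the function names are hypothetical, and the numerical constants (G, the solar mass, the Earth's mass and orbital radius) are standard rounded values rather than data from this text.

```python
import math


def kepler_period(a, m1, m2, G=6.674e-11):
    """Period of the relative orbit with semi-major axis a (SI units)."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (m1 + m2)))


def period_ratio(mass_ratio):
    """T(two-body) / T(planet mass neglected) for a planet/Sun mass ratio."""
    return 1.0 / math.sqrt(1.0 + mass_ratio)


# Rounded illustrative values (assumptions, not from the text).
M_SUN, M_EARTH, A_EARTH = 1.989e30, 5.97e24, 1.496e11

print("Earth's period in seconds:", kepler_period(A_EARTH, M_SUN, M_EARTH))
print("Period ratio for a Jupiter-like mass ratio:", period_ratio(0.001))
```

For a mass ratio of 0.001 the period is shortened by about five parts in ten thousand, which indicates why the effect is observable only for the most massive planet.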

Newton was the first to derive (6.17) and he used it to estimate the
Moon-Earth mass ratio. In terms of quantities readily measured in Newton's
day, Equation (6.17) gives the mass ratio (with $m_1$ the mass of the Earth and
$m_2$ the mass of the Moon)

$$\frac{m_2}{m_1} = \frac{4\pi^2a^3}{gR_e^2T^2} - 1, \tag{6.18}$$

where $R_e$ is the Earth's radius and $g$ is the gravitational acceleration at the
surface of the Earth. Even with the best data available today Equation (6.18)
gives a Moon-Earth mass ratio with no better than 30% accuracy. The main
reason for this is neglect of the effect of the Sun. The force of the Sun on the
Moon is in fact more than twice as great as the force of the Earth on the
Moon. Even so, we have seen that if the Sun's gravitational field were uniform
over the dimensions of the Earth-Moon system, it would have no effect on
the internal motion of the system, and (6.17) would be valid. The variations
of the Sun's force on the Moon are responsible for deviations from (6.17).
They have a maximum value of about one hundredth the force of the Earth
on the Moon. Newton realized all this, so he embarked on a long program of
analyzing the dynamics of the Earth-Moon system in ever increasing detail, a
program that has continued into this century. However, we shall see below
that a fairly good estimate of the Moon-Earth mass ratio can be achieved
without calculating the effect of the Sun.

Returning now to the interpretation of solutions to Equation (6.16), we
know that for a bound system the vector $\mathbf{r}$, representing the relative separation
of the particles, traverses an ellipse. However, this is not a complete
description of the system's internal motion, for both particles move relative to
the center of mass. The complete internal motion is easily ascertained from
(6.9a, b), which gives the internal particle positions $\mathbf{r}_1$ and $\mathbf{r}_2$ directly from $\mathbf{r}$.
From these equations we can conclude immediately that both particles move
on ellipses with focus at the center of mass and eccentricity vectors
$\boldsymbol{\varepsilon}_1 = -\boldsymbol{\varepsilon}_2 = \boldsymbol{\varepsilon}$, where $\boldsymbol{\varepsilon}$ is the eccentricity vector for the $\mathbf{r}$-orbit; the orbits differ in
scale, and both particles move so as to remain in opposition relative to the
center of mass. More specifically, from (6.9a, b) we have

$$m_1\mathbf{r}_1 = -m_2\mathbf{r}_2 = \mu\mathbf{r} \tag{6.19}$$

for all particle positions on the orbits. These relations are shown in Figure
6.2.


Fig. 6.2. Two-particle Kepler motion for m, = 2m,. 


Internal motion relative to the center of mass is observable only by
reference to some external object. For example, internal motion of the
Earth-Moon system gives rise to a small oscillation in the apparent direction of the
Sun, as indicated in Figure 6.3. This gives us another method of determining
the Earth-Moon mass ratio. Observations give a value of 6.5″ for the angle $\alpha$
in Figure 6.3, and $R = 1.5 \times 10^8$ km for the distance to the Sun. From this we
obtain $r_1 = \alpha R = 4.7 \times 10^3$ km for the distance of the Earth's center from
the center of mass $\mathbf{X}$. It will be noted that $\mathbf{X}$ lies well within the Earth's
surface. Now, using the value $r = 3.8 \times 10^5$ km for the average Earth-Moon
distance, from (6.19) we obtain the Earth-Moon mass ratio

$$\frac{m_1}{m_2} = 81, \tag{6.20}$$

close to the accepted value of 81.25 obtained by more refined methods. The
result obtained by this method is evidently so much better than the result
obtained from (6.18), because it involves only a lateral cross-section of the
Earth-Moon orbits along which the Sun's field is nearly uniform, whereas the
period $T$ in (6.18) depends on variations of the Sun's gravitational field over
the entire orbit. The discrepancy between these two results indicates that the
Sun has a measurable effect on the Moon's period and challenges us to account
for it. A method for solving this problem will be developed in Chapter 8.

Fig. 6.3. Internal motion of the Earth-Moon system (not to scale).
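
The arithmetic of this estimate can be sketched in a few lines. The code below is an illustration, not the author's calculation: the function name is hypothetical, and the relation used, $m_1/m_2 = r/r_1 - 1$ with the Earth as particle 1, is what (6.9a) or (6.19) gives when solved for the mass ratio. The inputs are the rounded values quoted above.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)


def earth_moon_mass_ratio(alpha_arcsec, sun_distance_km, moon_distance_km):
    """Return (r1, m_earth/m_moon) from the apparent solar oscillation.

    r1 is the distance of the Earth's center from the center of mass,
    obtained from the small-angle relation r1 = alpha * R.
    """
    r1 = alpha_arcsec * ARCSEC_IN_RAD * sun_distance_km
    ratio = moon_distance_km / r1 - 1.0   # from m1 * r1 = mu * r
    return r1, ratio


r1, ratio = earth_moon_mass_ratio(6.5, 1.5e8, 3.8e5)
print(f"r1 = {r1:.0f} km, mass ratio = {ratio:.1f}")
```

With these rounded inputs the sketch returns a ratio of roughly 80, the same size as the quoted value; the result is sensitive to the last digit of the angle and of the Earth-Moon distance.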


4-6. Exercises 


(6.1) For internal motion governed by Equation (6.16) show that the
energies and angular momenta of the two particles are related by

$$E_1 = \frac{m_2}{m_1}E_2, \qquad \mathbf{L}_1 = \frac{m_2}{m_1}\mathbf{L}_2,$$

while the total internal energy $E = E_1 + E_2$ and angular momentum
$\mathbf{L} = \mathbf{L}_1 + \mathbf{L}_2$ can be attributed to an equivalent single particle
of mass $\mu$.
(6.2) Show that two-particle elliptical orbits intersect, as in Figure 6.2,
when

$$\varepsilon > \frac{|m_1 - m_2|}{m_1 + m_2}.$$


4-7. Elastic Collisions 


In Section 4-2 we saw how the gravitational force law could be ascertained 
from experimental information about planetary orbits. For atomic particles, 
however, it is quite impossible to observe the orbits directly, so we must 
resort to more indirect methods to ascertain the atomic force laws. Experimental
information about atomic (and nuclear) forces is gained by scattering
experiments in which binary (two-particle) collisions are arranged. Measurements
on the particles can be made only in the asymptotic regions before and
after collision, where the interparticle interaction is negligible. The problem 
is to determine the forces which will produce the observed relations between 
the initial and final states. To approach this problem systematically, we first 
determine the consequences of the most general conservation laws for un- 
bounded motion of two-particle systems. Then (in Section 4-8) we investigate 
the consequence of specific laws, like the Coulomb force, which are believed 
to describe atomic forces. 


Conserved Quantities 


We are concerned here with unbounded motion of an isolated two-particle
system. The general equations we need were developed in the last section.
For an isolated system, Equation (6.3) tells us that the center of mass velocity
$\dot{\mathbf{X}}$ is constant, so according to (6.5),

$$M\dot{\mathbf{X}} = m_1\dot{\mathbf{x}}_1 + m_2\dot{\mathbf{x}}_2 \tag{7.1}$$

is a constant of the motion for a two-particle system. The product of a particle
mass with its velocity is called its momentum. Thus, the vector $m_1\dot{\mathbf{x}}_1$ is the
momentum of particle 1. Let $\mathbf{p}_1$ and $\mathbf{p}_1'$ denote initial and final momenta of
particle 1, that is, the asymptotic values of $m_1\dot{\mathbf{x}}_1$ before and after collision.
With a similar notation for the momentum of particle 2, we obtain from (7.1)
the law of momentum conservation

$$\mathbf{p}_1 + \mathbf{p}_2 = M\dot{\mathbf{X}} = \mathbf{p}_1' + \mathbf{p}_2'. \tag{7.2}$$

The additivity of momenta in this conservation law shows that momentum is
an important physical concept.

Momentum conservation is the most general principle governing collisions,
because it is independent of the forces involved. Next in generality we have
energy conservation, which holds if the interaction force is conservative. To
determine its consequences, we introduce the internal or center of mass
momentum $\mu\dot{\mathbf{r}}$, which, according to (6.11a, b) and (6.13), is related to the
external variables by

$$\mu\dot{\mathbf{r}} = m_1(\dot{\mathbf{x}}_1 - \dot{\mathbf{X}}) = -m_2(\dot{\mathbf{x}}_2 - \dot{\mathbf{X}}). \tag{7.3}$$

If $\mathbf{p}$ and $\mathbf{p}'$ are the initial and final values of $\mu\dot{\mathbf{r}}$, then

$$\mathbf{p} = \mathbf{p}_1 - m_1\dot{\mathbf{X}} = -(\mathbf{p}_2 - m_2\dot{\mathbf{X}}), \tag{7.4a}$$
$$\mathbf{p}' = \mathbf{p}_1' - m_1\dot{\mathbf{X}} = -(\mathbf{p}_2' - m_2\dot{\mathbf{X}}). \tag{7.4b}$$


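Now, if the internal force is conservative, then the internal energy is a
constant of the motion and equal to the value of the internal kinetic energy
$\tfrac{1}{2}\mu\dot{\mathbf{r}}^2$ in the asymptotic region. Thus, energy conservation is expressed by

$$\frac{\mathbf{p}^2}{2\mu} = \frac{\mathbf{p}'^2}{2\mu},$$

or simply

$$\mathbf{p}^2 = \mathbf{p}'^2. \tag{7.5}$$

This can be related to external variables by using the relation

$$\tfrac{1}{2}m_1\dot{\mathbf{x}}_1^2 + \tfrac{1}{2}m_2\dot{\mathbf{x}}_2^2 = \tfrac{1}{2}M\dot{\mathbf{X}}^2 + \tfrac{1}{2}\mu\dot{\mathbf{r}}^2, \tag{7.6}$$

which follows from (7.3) and holds whether energy is conserved or not.
Evaluating (7.6) in the asymptotic regions and using (7.5), we obtain energy
conservation in the form

$$\frac{\mathbf{p}_1^2}{2m_1} + \frac{\mathbf{p}_2^2}{2m_2} = \frac{\mathbf{p}_1'^2}{2m_1} + \frac{\mathbf{p}_2'^2}{2m_2}. \tag{7.7}$$

Of course, it is easier to apply (7.5) than (7.7).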
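
The kinetic energy decomposition (7.6) holds for arbitrary velocities, which makes it easy to test numerically. The sketch below is illustrative only; the helper names and the sample masses and velocities are assumptions. It computes the LAB kinetic energy, the center of mass part and the internal part separately, so that the first can be compared with the sum of the other two.

```python
def dot(a, b):
    """Euclidean inner product of two tuples."""
    return sum(x * y for x, y in zip(a, b))


def kinetic_energy_split(m1, v1, m2, v2):
    """Return (lab, cm, internal) kinetic energies of a two-particle system."""
    M = m1 + m2
    mu = m1 * m2 / M
    V = tuple((m1 * a + m2 * b) / M for a, b in zip(v1, v2))  # center of mass velocity
    u = tuple(a - b for a, b in zip(v1, v2))                  # relative velocity
    lab = 0.5 * m1 * dot(v1, v1) + 0.5 * m2 * dot(v2, v2)
    cm = 0.5 * M * dot(V, V)
    internal = 0.5 * mu * dot(u, u)
    return lab, cm, internal


lab, cm, internal = kinetic_energy_split(3.0, (1.0, 2.0, 0.0), 5.0, (-1.0, 0.5, 2.0))
print(lab, cm + internal)
```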

A collision is said to be elastic if it conserves energy and the masses of the
particles involved. The above equations apply to any binary elastic collision.
It follows from (7.5) that an elastic collision has the effect of simply rotating
the initial CM momentum $\mathbf{p}$ through some angle $\Theta$ into its final value $\mathbf{p}'$, as
described by the equation

$$\mathbf{p}' = \mathbf{p}\,e^{\mathbf{i}\Theta}, \tag{7.8}$$

where the unit bivector $\mathbf{i}$ specifies the scattering plane. The angle $\Theta$ is called
the CM scattering angle. The internal (CM) velocities of the colliding particles
are $\dot{\mathbf{r}}_1 = \dot{\mathbf{x}}_1 - \dot{\mathbf{X}}$ and $\dot{\mathbf{r}}_2 = \dot{\mathbf{x}}_2 - \dot{\mathbf{X}}$, which are always oppositely directed according
to (7.3), so the relation between initial and final states can be represented as
in Figure 7.1.

If the interparticle force is central as well as conservative, then angular
momentum $\mathbf{L}$ is conserved. This, in turn, implies that the direction $\hat{\mathbf{L}}$ and
magnitude $|\mathbf{L}|$ are separately conserved. Conservation of $\hat{\mathbf{L}}$ implies that the
orbits of both particles as well as the center of mass lie in a single plane, so $\hat{\mathbf{L}}$
can be identified with the $\mathbf{i}$ in (7.8). This condition is usually taken for granted in
scattering experiments. Conservation of $|\mathbf{L}|$ does not supply helpful relations
between initial and final states, because it pertains to details of the orbits
which are not observed in scattering. However, the initial value of $|\mathbf{L}|$ must be
given to determine the scattering angle $\Theta$ from a given interparticle force law.

Fig. 7.1. Center of mass variables.


Momentum and Energy Transfer 


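Note that all the above equations for elastic collisions hold for any value
given to the center of mass velocity $\dot{\mathbf{X}}$. From (7.4a, b) we see that the
momentum transfer $\Delta\mathbf{p}$ defined by

$$\Delta\mathbf{p} = \mathbf{p}' - \mathbf{p} = \mathbf{p}_1' - \mathbf{p}_1 = -(\mathbf{p}_2' - \mathbf{p}_2) \tag{7.9}$$

is independent of $\dot{\mathbf{X}}$. However, the energy transfer $\Delta E$ defined by

$$\Delta E = \frac{1}{2m_1}(\mathbf{p}_1^2 - \mathbf{p}_1'^2) = \frac{1}{2m_2}(\mathbf{p}_2'^2 - \mathbf{p}_2^2) \tag{7.10}$$

depends on the value of $\dot{\mathbf{X}}$, for, with the help of (7.4a, b), we can express it in
the form

$$\Delta E = -\dot{\mathbf{X}} \cdot \Delta\mathbf{p}. \tag{7.11}$$

According to (7.10), $\Delta E$ is positive if particle 1 loses energy in the collision
and negative if particle 1 gains energy.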
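
Relation (7.11) can be verified on a sample collision. The sketch below works in the scattering plane with ordinary two-component vectors, so the rotor in (7.8) is replaced by a plain rotation through the CM scattering angle; that substitution, the function names and the sample numbers are assumptions made for illustration. It evaluates the energy transfer once directly from (7.10) and once from (7.11).

```python
import math


def rotate(v, angle):
    """Rotate a 2-D vector counterclockwise through the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def energy_transfer(m1, v1, m2, v2, Theta):
    """Return (dE from (7.10), dE from (7.11)) for an elastic collision."""
    M = m1 + m2
    V = tuple((m1 * a + m2 * b) / M for a, b in zip(v1, v2))    # CM velocity
    p = tuple(m1 * (a - c) for a, c in zip(v1, V))              # (7.4a)
    p_final = rotate(p, Theta)                                  # elastic: |p'| = |p|
    p1 = tuple(m1 * a for a in v1)
    p1_final = tuple(q + m1 * c for q, c in zip(p_final, V))    # (7.4b)
    dE_direct = (sum(a * a for a in p1) - sum(a * a for a in p1_final)) / (2.0 * m1)
    dp = tuple(a - b for a, b in zip(p_final, p))
    dE_formula = -sum(c * d for c, d in zip(V, dp))
    return dE_direct, dE_formula


print(energy_transfer(1.0, (3.0, 0.0), 4.0, (0.0, 1.0), 0.8))
```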

Equation (7.11) has important applications in astromechanics as well as
atomic physics. A spacecraft travelling from Earth to the outer planets
Uranus, Neptune and Pluto can be given a large boost in velocity by scattering
it off Jupiter. According to (7.11), the boost can be maximized by
maximizing $\dot{\mathbf{X}} \cdot \Delta\mathbf{p} = \dot{\mathbf{X}} \cdot (\mathbf{p}' - \mathbf{p})$. For the Jupiter-spacecraft system, we can
identify $\dot{\mathbf{X}}$ with the velocity of Jupiter relative to the Sun. The initial
spacecraft momentum $\mathbf{p} = \mathbf{p}_1 - m_1\dot{\mathbf{X}}$ is determined by the launch of
the spacecraft from Earth. With $\mathbf{p}$ fixed, therefore, the maximum
boost is achieved by adjusting the impact parameter for the collision so
that $\mathbf{p}'$ is parallel to $\dot{\mathbf{X}}$. This can be arranged by appropriate timing of the
launch and maneuvering of the spacecraft. A "gravity-assist" trajectory
from Earth to Uranus is shown in Figure 7.2. The transit time from Earth to
Uranus is about 5 years on the assisted orbit as compared with 16 years on an
unassisted orbit with the same initial conditions (Exercise (7.6)).

Fig. 7.2. Trajectory from Earth to Uranus with "gravity-assist" by Jupiter.
(Figure labels: Jupiter encounter, Uranus encounter.)

LAB Scattering Variables for Elastic Collisions 


In a typical scattering experiment with atomic particles, one particle with
initial "LAB momentum" $\mathbf{p}_1$ is fired at a target particle at rest in the
laboratory with initial momentum $\mathbf{p}_2 = 0$. Scattering variables which can be
measured fairly directly in the laboratory are indicated in Figure 7.3. The
angle $\theta$ is called the LAB scattering angle, and $\phi$ is called the
recoil angle. With the "LAB condition" $\mathbf{p}_2 = 0$, Equation
(7.4a) gives the relation between LAB and CM momenta
before collision

$$\mathbf{p}_1 = \frac{M}{m_2}\mathbf{p} = M\dot{\mathbf{X}}. \tag{7.12a}$$

Fig. 7.3. LAB variables.


Using this to eliminate $\dot{\mathbf{X}}$ from (7.4b), we can solve for the final LAB
momenta in terms of CM momenta, with the result

$$\mathbf{p}_1' = \frac{m_1}{m_2}\mathbf{p} + \mathbf{p}', \tag{7.12b}$$
$$\mathbf{p}_2' = \mathbf{p} - \mathbf{p}' = -\Delta\mathbf{p}. \tag{7.12c}$$

These three equations (7.12a, b, c) describe all the relations between LAB
and CM variables. These relations along with the conservation laws (7.2) and
(7.5) are represented in Figure 7.4.

The laboratory energy transfer $\Delta E$ and the angle of deflection $\theta$ are of direct
interest in scattering, so let us express them in terms of CM variables and see
what that tells us. The total energy $E_0$ of the two-particle system is,
by (7.7), equal to the initial kinetic energy of the projectile, which, by (7.12a),
can be expressed in terms of the CM momentum; thus,

$$E_0 = \frac{\mathbf{p}_1^2}{2m_1} = \frac{M^2\mathbf{p}^2}{2m_1m_2^2}. \tag{7.13}$$

In the collision this energy will be redistributed among the particles. Since the
target particle is at rest initially, according to (7.10) the energy transfer $\Delta E$ is
equal to its kinetic energy after collision; by (7.12c), then,

$$\Delta E = \frac{(\mathbf{p}' - \mathbf{p})^2}{2m_2}. \tag{7.14}$$

The fractional energy transfer is therefore

$$\frac{\Delta E}{E_0} = \frac{\mu(\mathbf{p}' - \mathbf{p})^2}{M\mathbf{p}^2} = \frac{4m_1m_2}{(m_1 + m_2)^2}\sin^2\tfrac{1}{2}\Theta, \tag{7.15}$$

where we have used (7.5) and (7.8) to get

$$(\mathbf{p}' - \mathbf{p})^2 = 2\mathbf{p}^2(1 - \cos\Theta) = 4\mathbf{p}^2\sin^2\tfrac{1}{2}\Theta. \tag{7.16}$$

Fig. 7.4. Elastic scattering variables ($m_1 < m_2$).

Note that (7.15) would be unaffected by a change in the center of mass
velocity for the two-particle system, so it must be applicable to moving targets
as well as the stationary targets we are considering.

Some important conclusions can be drawn from (7.15). The energy transfer
has its maximum value when $\Theta = \pi$, so

$$\left(\frac{\Delta E}{E_0}\right)_{\max} = \frac{4m_1m_2}{(m_1 + m_2)^2}. \tag{7.17}$$

This tells us that all of the energy can be transferred to the target only if
$m_1 = m_2$. For this reason, since the mass of hydrogen is nearly the same as
the mass of the neutron, hydrogen-rich materials are more effective for slowing
down neutrons than heavy materials like lead. Also, when electrons pass
through a material they lose most of their energy to other electrons rather than
atomic nuclei. To see how little energy electrons lose to nuclei, we need only
know that the proton-electron mass ratio is 1836, so $(\Delta E/E_0)_{\max} \approx 4/1836 \approx 0.2\%$.
A nucleus hardly budges when an electron bounces off it,
just as a bowling ball will hardly be budged by collision with a ping-pong ball.
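
These estimates follow directly from the fractional energy transfer formula (7.15). A minimal sketch is given below; the function name is hypothetical, and the mass number 207 used for lead is a rounded illustrative value rather than data from the text. Masses enter only through their ratio, so any consistent unit may be used.

```python
import math


def fractional_energy_transfer(m1, m2, Theta):
    """Fraction of the projectile energy given to a target initially at rest.

    m1 is the projectile mass, m2 the target mass and Theta the CM
    scattering angle in radians, following (7.15).
    """
    return 4.0 * m1 * m2 / (m1 + m2) ** 2 * math.sin(0.5 * Theta) ** 2


# Head-on collisions (Theta = pi) give the maximum transfer.
print("neutron on hydrogen:", fractional_energy_transfer(1.0, 1.0, math.pi))
print("neutron on lead    :", fractional_energy_transfer(1.0, 207.0, math.pi))
print("electron on proton :", fractional_energy_transfer(1.0, 1836.0, math.pi))
```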

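To relate the LAB scattering angle $\theta$ to the CM scattering angle $\Theta$, we note
that $\theta$ can be defined algebraically by

$$\hat{\mathbf{p}}_1' = \hat{\mathbf{p}}_1 e^{\mathbf{i}\theta}. \tag{7.18}$$

According to (7.12a), $\hat{\mathbf{p}}_1 = \hat{\mathbf{p}}$, so if we multiply (7.12b) by $\mathbf{p}$ and introduce the
scattering angles by (7.8) and (7.18), we obtain

$$\frac{p_1'}{p}e^{\mathbf{i}\theta} = \frac{m_1}{m_2} + e^{\mathbf{i}\Theta}, \tag{7.19}$$

where $p = |\mathbf{p}|$ and $p_1' = |\mathbf{p}_1'|$.
We can eliminate $p$ and $p_1'$ from this relation by observing that it implies

$$\left(\frac{p_1'}{p}\right)^2 = \left(\frac{m_1}{m_2} + e^{\mathbf{i}\Theta}\right)\left(\frac{m_1}{m_2} + e^{-\mathbf{i}\Theta}\right) = 1 + \left(\frac{m_1}{m_2}\right)^2 + 2\frac{m_1}{m_2}\cos\Theta,$$

which, on substitution back into (7.19), gives

$$e^{\mathbf{i}\theta} = \frac{m_1/m_2 + e^{\mathbf{i}\Theta}}{[1 + (m_1/m_2)^2 + 2(m_1/m_2)\cos\Theta]^{1/2}}. \tag{7.20}$$

A somewhat simpler relation between scattering angles can be obtained by
taking the ratio of bivector to scalar parts of (7.19) or (7.20) to get

$$\tan\theta = \frac{\sin\Theta}{m_1/m_2 + \cos\Theta}. \tag{7.21}$$

To interpret this formula for $m_1 < m_2$, refer to Figure 7.4. For fixed initial
CM momentum $\mathbf{p}$, the final momentum $\mathbf{p}'$ must lie on a sphere of radius $|\mathbf{p}|$.
It is clear from Figure 7.4 that there is a unique value of $\theta$ for every value of $\Theta$
in the range $0 \le \Theta \le \pi$, which covers all possibilities. Indeed, for the limiting
case of a stationary target ($m_1 \ll m_2$), Equation (7.21) reduces to
$\tan\theta \approx \tan\Theta$, whence $\theta \approx \Theta$. However, in the equal mass case $m_1 = m_2$, the
origin O in Figure 7.4 lies on the circle, and (7.21) reduces to $\tan\theta = \tan\tfrac{1}{2}\Theta$,
whence $\theta = \tfrac{1}{2}\Theta$.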

For a light target ($m_1 > m_2$), the origin O lies outside the circle, as shown in
Figure 7.5. In this case, there are two values, say $\Theta_1$ and $\Theta_2$, of the CM
scattering angle for each value of the LAB scattering angle. The two values
$\Theta_1$ and $\Theta_2$ can be distinguished in the lab by measuring the kinetic energy of
the scattered particle. The LAB scattering angle has a maximum
value $\theta_{\max}$ given by

$$\sin\theta_{\max} = \frac{m_2}{m_1}, \tag{7.22}$$

which can be read directly off Figure 7.5. From this we can deduce, for example, that
a proton cannot be scattered by more than 0.03° by an electron. Therefore, any
significant deflection of protons or heavier atomic nuclei passing through matter is due
to collisions with nuclei rather than electrons.

Fig. 7.5. Range of scattering angles for $m_1 > m_2$.
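
The relations (7.21) and (7.22) between LAB and CM angles can be explored with a short sketch. The function names below are hypothetical, and the use of `atan2` to select the branch of the arctangent with $0 \le \theta \le \pi$ is a choice made here for illustration. Masses enter only through the ratio of projectile mass m1 to target mass m2.

```python
import math


def lab_angle(m1, m2, Theta):
    """LAB scattering angle (radians) for CM angle Theta, following (7.21)."""
    return math.atan2(math.sin(Theta), m1 / m2 + math.cos(Theta))


def max_lab_angle(m1, m2):
    """Largest possible LAB scattering angle (radians).

    For a projectile lighter than the target every angle up to pi occurs;
    otherwise the bound (7.22) applies.
    """
    if m1 < m2:
        return math.pi
    return math.asin(m2 / m1)


print("equal masses, Theta = 1.0:", lab_angle(1.0, 1.0, 1.0))
print("proton on electron, max (deg):", math.degrees(max_lab_angle(1836.0, 1.0)))
print("alpha on proton, max (deg):", math.degrees(max_lab_angle(4.0, 1.0)))
```

The last line corresponds to the situation of Exercise (7.3), with the alpha-proton mass ratio rounded to 4.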


4-7. Exercises 


(7.1) Establish Equations (7.6) and (7.11).
(7.2) Prove that for elastic scattering of equal mass particles the sum of
the scattering angle and recoil angle is always 90°.
(7.3) Alpha particles (i.e. helium nuclei $^4_2\mathrm{He}$) are scattered elastically
from protons at rest. Show that for a scattered particle the maximum
angle of deflection is 14.5°, and the maximum fractional
energy loss is 64%.
(7.4) A particle of mass $m_1$ collides elastically with a particle of mass $m_2$ at
rest. Determine the mass ratio $m_1/m_2$ from the scattering angle $\theta$
and the recoil angle $\phi$.
(7.5) An unstable particle of mass $m = m_1 + m_2$ decays into particles
with masses $m_1$ and $m_2$, releasing energy $Q$ to products.
(a) Determine the CM kinetic energies of the two particles produced.
(b) If the unstable particle has an initial kinetic energy $K$, determine
the maximum and minimum kinetic energies of the products.
(7.6) Evaluate the advantage of gravitational assist by Jupiter for a
mission from Earth to Uranus (Figure 7.2) as follows:
(a) Calculate the time of passage $\Delta t_0$ from Earth to Uranus on an
orbit of minimum energy. Determine the speed $v$ of the spacecraft
as it crosses Jupiter's orbit, and the angle of intersection $\alpha$
between the two orbits. (Data on the planetary orbits are given
in Appendix C).
(b) Suppose that the launch is arranged so the spacecraft encounters
Jupiter on the orbit specified by (a). Calculate the maximum
speed $\Delta v$ that the satellite can gain from scattering off
Jupiter, and the corresponding scattering angle $\Theta$ in the rest
system of Jupiter.
(c) Determine the distance $d$ of closest approach to the surface of
Jupiter for maximum speed gain. (The radius of Jupiter is
$R_J = 71\,400$ km).
(d) Determine the eccentricity $\varepsilon$ of the spacecraft's orbit in the
heliocentric system after escape from Jupiter's influence. Calculate
the transit time $\Delta t_2$ from Jupiter to Uranus on this orbit
by evaluating the integral

$$\Delta t_2 = \frac{m}{L}\int_0^{\theta_2} r^2\,d\theta.$$

The upper limit $\theta_2$ can be determined from the orbit Equation
(3.6a). Similarly, calculate the transit time $\Delta t_1$ from Earth to
Jupiter to get the total transit time for the mission to Uranus.
Of course, these estimates are only approximate, since the
influences of Jupiter and the Sun were evaluated separately.


4-8. Scattering Cross Sections 


In a typical scattering experiment an incident beam of monoenergetic par- 
ticles is directed at a small sample containing the target particles, and the 
scattered particles are collected in detectors, as shown in Figure 8.1. Even 
solid material is mostly “empty space” at the atomic level, so it is not difficult 
to prepare a sample thin enough so that multiple collisions with incident 
particles are negligibly rare compared to single collisions. Consequently, we 
can restrict our attention to the scattering of the beam by a single target in the 
sample. 

All the particles in the incident beam have the same energy $E_0$ and direction
of motion. The beam has a uniform cross section with an intensity $N_0$ defined
as the number of particles per unit area per unit time incident on the sample.
Let $N$ be the number of incident particles scattered by a single target particle
in a unit time. The total cross section $\sigma$ is defined by

$$\sigma = \frac{N}{N_0}. \tag{8.1}$$

Fig. 8.1. Arrangement for a scattering experiment.


It has the dimensions of area. It can be interpreted as the area of an imaginary
disk in the asymptotic region transverse to the beam and centered on a line
through the center of the target, so that only the incident particles which pass
through the disk are scattered. Now, consider an annulus on this disk of
radius $b$ and width $db$. All incident particles intercepting this annulus have the
same impact parameter $b$, so, for a central force, they will all be scattered by
the same angle $\theta$, as shown in Figure 8.2. Particles intercepting a segment of
the annulus with area $d\sigma = b\,db\,d\phi$ will be scattered into a segment on a
unit sphere centered at the target with the area $d\omega = \sin\theta\,d\theta\,d\phi$ called the solid
angle. The quantity

$$\frac{d\sigma}{d\omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| = \frac{1}{2}\left|\frac{d(b^2)}{d(\cos\theta)}\right| \tag{8.2}$$

is called the (differential) scattering cross section. Since all scattered particles
must pass through the sphere somewhere, the total cross section will be
obtained by integrating the cross section over the unit sphere, that is,

$$\sigma = \oint\frac{d\sigma}{d\omega}\,d\omega = 2\pi\int b\,db. \tag{8.3}$$


The rate at which particles are "scattered into $d\omega$" is

$$\frac{dN}{d\omega} = N_0\frac{d\sigma}{d\omega}, \tag{8.4}$$

which gives (8.1) when substituted into (8.3). Thus, the differential cross
section can be measured quite directly simply by counting particles scattered
through each angle, as indicated in Figure 8.1.

Fig. 8.2. Central force scattering for a given impact parameter or scattering angle.

Given the force law which determines the scattering, we can deduce the
deflection function $b = b(E_0, \theta)$ for the impact parameter as a function of
initial energy and scattering angle. Then we compute $db/d\theta$ and obtain $d\sigma/d\omega$
from (8.2). (Note that $db/d\theta$ is negative, because, as Figure 8.2 shows, an
increase in scattering angle corresponds to a decrease in impact parameter;
for this reason the absolute value $|db/d\theta|$ has been used in Equation (8.2).)
It is much easier to predict $d\sigma/d\omega$ from a given force law than it is to
determine a force law from the observed values of $d\sigma/d\omega$. So, when the force
law is unknown, the usual approach is to guess at the form of the force law,
compare the predicted $d\sigma/d\omega$ with experimental data, and then look for
simple modifications of the force law which will account for the discrepancies.
Of course some discrepancies are not due to the force law at all, but to other 
effects such as multiple scattering. The theory of atomic scattering has 
reached a high level of sophistication, and physicists are able to distinguish a 
wide variety of subtle effects. Although the classical theory of interactions
which we have been studying must be modified to account for quantum effects
at the atomic level, we must first know the consequences of the classical 
theory before we can understand the modifications of quantum theory. 
Moreover, the consequences of classical and quantum theories are practically 
equivalent in many situations. Therefore, it is well worthwhile to continue 
applying classical concepts to the analysis of atomic phenomena. 

We have reduced the problem of analyzing a scattering experiment to the
determination of the deflection function $b = b(E_0, \theta)$ for the impact
parameter as a function of angle. Let us carry out the calculation of the deflection
function and scattering cross section for some important force laws.


Hard Sphere Scattering 


Let us first calculate the deflection function and cross section for the simplest 
kind of scattering, the scattering of particles by a stationary hard-sphere. 
Later we shall see that the general problem of hard-sphere scattering can be 
solved by reducing it to this one. As shown in Figure 8.3, an incident particle 
is scattered impulsively on contact with the surface of the sphere. Let p and p’ 
respectively be the initial and final momentum of the scattered particle. 
Energy conservation implies that 


$$|p| = |p'|. \tag{8.5}$$

Now, the force of interaction is central, though discontinuous at the surface of
the sphere. Consequently, angular momentum is conserved in the collision, and

$$R \wedge p = R \wedge p', \tag{8.6}$$

where $R$ is the radius vector to the point where the particle makes contact with the
sphere. The left side of (8.6) is the angular momentum immediately before collision,
while the right side is the angular momentum immediately after. From (8.5)
and (8.6) we conclude that $\sin\alpha = \sin\alpha'$ and

$$\alpha = \alpha', \tag{8.7}$$

that is, the angle of incidence $\alpha$ equals the angle of reflection $\alpha'$.

Fig. 8.3. Scattering by a stationary hard sphere.

As Figure 8.3 shows, the scattering angle is given by $\theta = \pi - \alpha - \alpha' = \pi - 2\alpha$,
and the deflection function is given by

$$b = R\sin\alpha = R\sin\left(\frac{\pi}{2} - \frac{\theta}{2}\right) = R\cos\frac{\theta}{2}. \tag{8.8}$$

Therefore, $db/d\theta = -\frac{1}{2}R\sin\frac{1}{2}\theta$, and (8.2) gives

$$\frac{d\sigma}{d\omega} = \frac{1}{4}R^2. \tag{8.9}$$


Thus, the differential cross section is isotropic, which is to say that particles 
are scattered at the same rate in all directions. 
Substituting (8.9) into (8.3), we get

$$\sigma = \frac{1}{4}R^2 \int d\omega = \pi R^2. \tag{8.10}$$

Thus, the total cross-section is exactly equal to the cross-sectional area of the
sphere, as we expect from geometrical considerations.
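The hard-sphere results can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates (8.2) with a finite-difference derivative and integrates over the unit sphere. The function names, the step size h, the radius value, and the midpoint rule are all assumptions made for illustration.

```python
import math

def cross_section(b, theta, h=1e-6):
    """Differential cross section (b / sin(theta)) * |db/dtheta|, Eq. (8.2).

    b is a callable deflection function b(theta). The derivative is
    approximated by a central difference with step h (an assumption of
    this sketch, not something specified in the text).
    """
    db_dtheta = (b(theta + h) - b(theta - h)) / (2.0 * h)
    return b(theta) / math.sin(theta) * abs(db_dtheta)

R = 2.0  # sphere radius in arbitrary units (illustrative value)

def hard_sphere_b(theta):
    # Deflection function (8.8): b = R cos(theta / 2)
    return R * math.cos(theta / 2.0)

# Sample the cross section at a few angles; each should be close to R^2 / 4.
samples = [cross_section(hard_sphere_b, t) for t in (0.3, 1.0, 2.0, 3.0)]

def total_cross_section(b, n=2000):
    """Integrate the cross section over the unit sphere (midpoint rule)."""
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        total += cross_section(b, theta) * math.sin(theta) * dtheta
    return 2.0 * math.pi * total  # the azimuthal integral contributes 2*pi

sigma = total_cross_section(hard_sphere_b)
```

Within the accuracy of the finite differences, every sample agrees with $\frac{1}{4}R^2$ and the integral agrees with $\pi R^2$, as (8.9) and (8.10) require.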


Coulomb Scattering 


We derived the deflection function for scattering by a Coulomb force in 
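Section 4-3. According to (3.24) the Coulomb deflection function is

$$b = a\cot\tfrac{1}{2}\theta, \tag{8.11}$$

where

$$a = \frac{|q_1 q_2|}{2E_0} \tag{8.12}$$

for interacting particles with charges $q_1$ and $q_2$. Now, the derivative of (8.11) is

$$\frac{db}{d\theta} = -\frac{a}{2}\,\frac{1}{\sin^2\tfrac{1}{2}\theta}.$$

So, according to (8.2),

$$\frac{d\sigma}{d\omega} = \frac{a^2\cot\tfrac{1}{2}\theta}{2\sin\theta\,\sin^2\tfrac{1}{2}\theta}.$$

But $\sin\theta = 2\sin\tfrac{1}{2}\theta\cos\tfrac{1}{2}\theta$. Hence,

$$\frac{d\sigma}{d\omega} = \frac{a^2}{4}\,\frac{1}{\sin^4\tfrac{1}{2}\theta}. \tag{8.13}$$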
This formula is the justly famous Rutherford Scattering Cross-section. Ernest
Rutherford derived it in 1911 and showed that it accurately described the
angular distribution of $\alpha$ particles (He nuclei) scattered from heavy nuclei in
experiments by Geiger and Marsden. The $\sin^{-4}\tfrac{1}{2}\theta$ dependence was verified
over a range of angles on which $d\sigma/d\omega$ varied by a factor of 250 000. The
energy dependent factor $a$ was varied by a factor of 10. In particular, the
value $d\sigma/d\omega = a^2/4$ agreed well with the experimental counts for backscattering
(when $\theta \approx \pi$). Backscattering can occur only for a head-on collision, and
the parameter $a$ is the distance of closest approach. The experiments attained
sufficient energy to give values of $a \approx 10^{-12}$ cm for which the Rutherford
formula (8.13) held good. From this Rutherford was able to conclude that the
positive charge in an atom is concentrated in a nucleus of radius no more than
$10^{-12}$ cm, about one ten-thousandth of the known diameter of an atom.
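As a check on the algebra leading to the Rutherford formula, one can compare the closed form (8.13) with a direct numerical evaluation of (8.2) for the Coulomb deflection function (8.11). This is only an illustrative sketch; the function names, the sample angles, the unit value of $a$, and the finite-difference step are assumptions.

```python
import math

def coulomb_b(theta, a):
    """Coulomb deflection function (8.11): b = a cot(theta / 2)."""
    return a / math.tan(theta / 2.0)

def cross_section_from_b(theta, a, h=1e-6):
    """Equation (8.2) with db/dtheta taken by central difference."""
    db_dtheta = (coulomb_b(theta + h, a) - coulomb_b(theta - h, a)) / (2.0 * h)
    return coulomb_b(theta, a) / math.sin(theta) * abs(db_dtheta)

def rutherford(theta, a):
    """Closed form (8.13): (a^2 / 4) / sin^4(theta / 2)."""
    return a * a / (4.0 * math.sin(theta / 2.0) ** 4)

a = 1.0  # length scale in arbitrary units (illustrative value)
angles = (0.5, 1.0, 2.0, 3.0)
relative_errors = [
    abs(cross_section_from_b(t, a) - rutherford(t, a)) / rutherford(t, a)
    for t in angles
]
backscatter = rutherford(math.pi, a)  # the backscattering value a^2 / 4
```

The two evaluations agree to within the finite-difference error, and the backscattering value reduces to $a^2/4$ as stated in the text.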

The Rutherford cross-section (8.13) is infinite at $\theta = 0$. As (8.11) shows,
the Coulomb force gives some scattering no matter how large the impact
parameter. In atomic scattering, however, the concentrated charge of a
nucleus is screened by the cloud of atomic electrons surrounding it.
Consequently, the atom will appear neutral and the scattering of $\alpha$ particles
will be negligible for impact parameters greater than the radius of an atom
(about $10^{-8}$ cm). Coulomb's law provides a good description of the atomic
force only for $\alpha$ particles that penetrate the electron cloud. We have seen that
the electrons themselves cannot significantly scatter an $\alpha$ particle, because
they are so much lighter.


Lab and CM Cross Sections 


So far we have evaluated scattering cross sections only under the assumption 
that the target is stationary. Target recoil is most easily taken into account by 
evaluating the cross section in terms of the center of mass (CM) variables and 
then transforming the result to LAB variables. We have seen that the 2-particle
scattering problem is reduced to an equivalent 1-particle problem by using
CM variables.

The relation between the LAB scattering angle $\theta$ and the CM scattering
angle $\Theta$ was determined in Section 4-7. In particular, the scalar part of
Equation (7.20) gives the relation

$$\cos\theta = \frac{m_1/m_2 + \cos\Theta}{[1 + (m_1/m_2)^2 + 2(m_1/m_2)\cos\Theta]^{1/2}}. \tag{8.14}$$

LAB scattering through an angle $\theta$ into $d\omega = \sin\theta\,d\theta\,d\phi$ corresponds to CM
scattering through an angle $\Theta$ into $d\Omega = \sin\Theta\,d\Theta\,d\phi$. According to (8.2),
therefore, the lab scattering cross section $d\sigma/d\omega$ is related to the CM scattering
cross section $d\sigma/d\Omega$ by

$$\frac{d\sigma}{d\Omega} = \frac{d\sigma}{d\omega}\,\frac{d\omega}{d\Omega} = \frac{d\sigma}{d\omega}\,\frac{d(\cos\theta)}{d(\cos\Theta)}. \tag{8.15}$$


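From (8.14) we obtain, after some algebra,

$$\frac{d\omega}{d\Omega} = \frac{d(\cos\theta)}{d(\cos\Theta)} = \frac{1 + (m_1/m_2)\cos\Theta}{[1 + (m_1/m_2)^2 + 2(m_1/m_2)\cos\Theta]^{3/2}} = \frac{[1 - (m_1/m_2)^2\sin^2\theta]^{1/2}}{\{(m_1/m_2)\cos\theta + [1 - (m_1/m_2)^2\sin^2\theta]^{1/2}\}^2}. \tag{8.16}$$

For $m_1 = m_2$ this reduces to

$$\frac{d\omega}{d\Omega} = \frac{1}{4\cos\tfrac{1}{2}\Theta} = \frac{1}{4\cos\theta}. \tag{8.17}$$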


Recall from Section 4-7 that for $m_1 > m_2$ there are two distinct CM deflections
for each lab angle $\theta$, but the experimenter can distinguish between them
by an energy analysis of the scattered particles. Of course, for a heavy target
($m_1 \ll m_2$), the expression (8.16) reduces to $d\omega/d\Omega \approx 1$, so the lab and CM
cross-sections are nearly equal.
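The factor relating the lab and CM solid angles can be explored numerically. The sketch below is an illustration under stated assumptions: the ratio r stands for $m_1/m_2$, the function names are hypothetical, and the LAB-angle form is the standard expression for $m_1 \le m_2$ (where the relation between the angles is single valued), quoted here rather than derived.

```python
import math

def lab_angle(Theta, r):
    """LAB angle theta from the CM angle Theta via (8.14); r = m1/m2."""
    c = (r + math.cos(Theta)) / math.sqrt(1 + r * r + 2 * r * math.cos(Theta))
    return math.acos(c)

def jacobian_cm(Theta, r):
    """d(omega)/d(Omega) written in terms of the CM angle."""
    return (1 + r * math.cos(Theta)) / (1 + r * r + 2 * r * math.cos(Theta)) ** 1.5

def jacobian_lab(theta, r):
    """The same factor written in terms of the LAB angle (assumes r <= 1)."""
    s = math.sqrt(1 - (r * math.sin(theta)) ** 2)
    return s / (r * math.cos(theta) + s) ** 2

# Compare both forms for a light projectile and for equal masses.
pairs = []
for r in (0.25, 1.0):
    for Theta in (0.4, 1.3, 2.5):
        pairs.append((jacobian_cm(Theta, r), jacobian_lab(lab_angle(Theta, r), r)))

equal_mass = jacobian_cm(1.3, 1.0)      # should equal 1 / (4 cos(Theta / 2))
heavy_target = jacobian_cm(1.0, 1e-3)   # should be close to 1
```

For equal masses the factor reduces to $1/(4\cos\tfrac{1}{2}\Theta)$, and for a very heavy target it approaches 1, in agreement with the limits discussed above.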

The expression (8.16) for the factor $d\omega/d\Omega$ in (8.15) shows that even
when the CM cross section is simple, the angular dependence of the lab cross
section can be quite complex. Consider, for example, the CM scattering of
smooth hard spheres illustrated in Figure 8.4. The adjectives “hard” and
“smooth” express the assumption that the collision does not excite any
significant internal vibration or rotation of the spheres. It should be evident
from Figure 8.4 that the CM scattering of spheres with radii $R_1$ and $R_2$ is
equivalent to the scattering of a particle from a stationary sphere of radius
$R = R_1 + R_2$, as illustrated in Figure 8.4. According to (8.9), therefore, the
CM cross section has the constant value $d\sigma/d\Omega = \frac{1}{4}R^2$, and, by (8.15) and
(8.17), the LAB cross section for the equal mass case is

$$\frac{d\sigma}{d\omega} = R^2\cos\theta, \tag{8.18}$$

where $0 \le \theta \le \frac{1}{2}\pi$, because $\theta = \frac{1}{2}\Theta$ when $m_1 = m_2$. Thus, all the scattering
is in the forward direction.

Fig. 8.4. CM collision of hard spheres.
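Although the LAB cross section (8.18) is no longer isotropic, it must still integrate to the geometrical cross section $\pi R^2$. A short numerical check, offered as a sketch with an illustrative radius and a simple midpoint rule (both assumptions), is:

```python
import math

def lab_cross_section(theta, R):
    """Equal-mass hard-sphere LAB cross section (8.18), for 0 <= theta <= pi/2."""
    return R ** 2 * math.cos(theta)

def total_lab_cross_section(R, n=4000):
    """Integrate (8.18) over the forward hemisphere with the midpoint rule."""
    dtheta = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        total += lab_cross_section(theta, R) * math.sin(theta) * dtheta
    return 2.0 * math.pi * total  # the azimuthal integral contributes 2*pi

R = 1.5  # R = R1 + R2 in arbitrary units (illustrative value)
sigma_lab = total_lab_cross_section(R)
```

The result agrees with $\pi R^2$, so the transformation to LAB variables redistributes the scattered particles in angle without changing their total number.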

4-8. Exercises 

(8.1) For a particle with mass $m_1$ scattered by a hard-sphere with mass
$m_2$, show that the angle of incidence $\alpha$ is related to the angle of
reflection $\alpha'$ by

$$\tan\alpha' = \left(\frac{m_2 + m_1}{m_2 - m_1}\right)\tan\alpha.$$
(8.2) In proton-proton scattering, the incident particles scattered cannot
be distinguished from recoiling targets. Show, therefore, that for
classical Coulomb scattering the angular distribution of protons
detected in the LAB should be given by

$$\frac{d\sigma}{d\omega} = \left(\frac{e^2}{2E_0}\right)^2 (\sin^{-4}\theta + \cos^{-4}\theta)\cos\theta.$$

(8.3) The CM energy distribution of scattered particles with final LAB
energy $E$ is given by

$$\frac{d\sigma}{dE} = \frac{d\sigma}{d(\cos\Theta)}\,\frac{d(\cos\Theta)}{dE}.$$

Show that for hard-sphere scattering

$$\frac{d\sigma}{dE} = \frac{(m_1 + m_2)^2}{4 m_1 m_2}\,\frac{\sigma}{E_0},$$

and for Coulomb scattering

$$\frac{d\sigma}{dE} = \frac{\pi m_1}{m_2 E_0}\left(\frac{q_1 q_2}{E_0 - E}\right)^2.$$

(8.4)


The screening of nuclear charge by atomic electrons can be taken
into account in a rough way by using the Cutoff Coulomb force
defined by

$$f = \frac{q_1 q_2\,\hat{r}}{r^2} \quad \text{for } r < R,$$

$$f = 0 \quad \text{for } r > R.$$


For a stationary target, use the eccentricity conservation law of 
Section 4-3 to derive the following expression for momentum trans- 
fer to the scattered particle 


' b— P 14,92 (R° — 5°)" A ay A b ee) 
P -P = —>ERi aeapmes (© Pot Pee 


where $b$ is the impact parameter and $E = p^2/2m$ is the energy.
Express this result in terms of the scattering angle $\theta$ and derive the
deflection function

$$b^2 = \frac{R^2}{1 + (R/a + 1)^2\tan^2\tfrac{1}{2}\theta},$$

where $a = |q_1 q_2|/2E$. Derive the differential scattering cross section

$$\frac{d\sigma}{d\omega} = \frac{a^2 R^2 (a + R)^2}{4[a^2 + R(R + 2a)\sin^2\tfrac{1}{2}\theta]^2}.$$

Note that this reduces to the Rutherford cross section for $R \gg a$
and to the hard-sphere cross section for $R \ll a$. What is the total
cross section?


Chapter 5 


Operators and Transformations 


This chapter develops a system of mathematical concepts of great utility in all 
branches of physics. Linear operators and transformations are represented in 
terms of geometric algebra to facilitate computation. The group theory of 
rotations, reflections and translations is discussed in detail. The most import- 
ant result is a compact spinor representation of finite rotations, which is 
shown to be a powerful computational device. This representation is used to 
develop the kinematics of rigid motions, which, in turn, is applied to the 
description of reference frames and motion with respect to moving frames. 


5-1. Linear Functions and Matrices


Linear functions arise so frequently in physics that it is worthwhile to study 
their mathematical properties systematically. 
A function $F = F(X)$ is said to be linear if

$$F(\alpha X + \beta Y) = \alpha F(X) + \beta F(Y), \tag{1.1}$$

where $\alpha$ and $\beta$ are scalars. The condition (1.1) is equivalent to the two
independent conditions

$$F(X + Y) = F(X) + F(Y), \tag{1.2a}$$

$$F(\alpha X) = \alpha F(X). \tag{1.2b}$$


We have been using a variety of linear functions all along, of course. For
example, the function $s(x) = a \cdot x$ is a linear function of a vector variable. The
linearity of this function comes from the distributive property of the inner
product; thus,

$$s(\alpha x + \beta y) = a \cdot (\alpha x + \beta y) = a \cdot (\alpha x) + a \cdot (\beta y) = \alpha (a \cdot x) + \beta (a \cdot y) = \alpha s(x) + \beta s(y).$$

Similarly, a linear bivector-valued function of a vector variable is defined by
$b(x) = a \wedge x$. And a spinor-valued function of a vector variable is defined by
$S(x) = s(x) + b(x) = ax$. In general, the sum of linear functions with the
same domain is also a linear function.

We shall be concerned primarily with vector-valued linear functions of a
vector variable, more specifically, with linear functions which transform (or
map) vectors in Euclidean 3-space $\mathcal{E}_3$ into vectors in $\mathcal{E}_3$. Such functions are
commonly called linear transformations, linear operators or tensors. Strictly
speaking, the tensors we deal with here are tensors of rank 2, but we can
ignore that, since we will not need a more general concept of tensor. We
have, of course, encountered specific tensors before; for example, the projection

$$P(x) = a^{-1} a \cdot x = \tfrac{1}{2}(x + a^{-1} x a) \tag{1.3}$$


and its generalizations introduced in Section 2-4. 
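To make the projection concrete, here is a small numerical sketch. It represents vectors as tuples of components and uses the fact that a nonzero vector has the inverse $a^{-1} = a/|a|^2$; the helper names and sample vectors are assumptions chosen for illustration.

```python
def dot(u, v):
    """Inner product of two vectors given as component tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project(a, x):
    """P(x) = a^{-1} (a . x), with a^{-1} = a / |a|^2 for a nonzero vector a."""
    coeff = dot(a, x) / dot(a, a)
    return tuple(coeff * ai for ai in a)

def combo(alpha, x, beta, y):
    """The linear combination alpha*x + beta*y."""
    return tuple(alpha * xi + beta * yi for xi, yi in zip(x, y))

a = (1.0, 2.0, 2.0)
x = (3.0, -1.0, 4.0)
y = (0.0, 5.0, -2.0)

# Linearity (1.1): P(2x + 3y) should equal 2P(x) + 3P(y).
lhs = project(a, combo(2.0, x, 3.0, y))
rhs = combo(2.0, project(a, x), 3.0, project(a, y))

# Projecting twice changes nothing, as expected of a projection.
once = project(a, x)
twice = project(a, once)
```

Both checks hold up to rounding, which illustrates that the projection is a linear function in the sense of (1.1).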

Although the terms “linear transformation” and “tensor” refer to mathematical
functions of the same kind, they are not completely synonymous,
because they have different connotations in applications. The term “tensor”
is used when describing certain properties of physical systems. For example,
the inertia tensor is a property of a rigid body to be discussed in Chapter 6. It
is never called the “inertia linear transformation”. On the other hand, the
term “transformation” generally suggests some change of state in a physical
system or an equivalence of one system with another. The term “linear
operator” is fairly free of such connotations, so it may be preferred when the
emphasis is on mathematical structure.

To handle linear transformations efficiently, we need a suitable notation
and formulation of general properties. For a linear transformation $f$ it is a
common practice to write $f(x) = fx$, allowing the parenthesis to be dropped in
writing $f$ as a function of $x$. Accordingly, the composite function $g(f(x))$ of
linear functions $f$ and $g$ can be written in any one of the forms

$$g(f(x)) = g(fx) = gf(x) = gfx. \tag{1.4}$$


The composite gf of linear operators is often called the product of g and f. 
There is some danger of confusing this kind of product with the geometric 
product AB of multivectors A and B, because we will have occasion to use 
both kinds of product in the same equation. However, to help keep the 
distinction between multivectors and linear operators clear, we shall usually 
use script type to denote linear operators. An important exception to this 
convention is the most elementary kind of linear transformation 


$$\alpha(x) = \alpha x, \tag{1.5}$$

obtained by multiplying vectors by a scalar $\alpha$. Here the same symbol $\alpha$ is used
to denote both a scalar and the associated linear operator.
Both the product $gf$ and the sum $f + g$ of linear operators are themselves
linear operators. The product is associative, that is, for linear operators $f$, $g$
and $h$ we have the rule of composition

$$h(gf) = (hg)f. \tag{1.6}$$


From (1.2a) it follows that the operator product is also distributive with 
respect to addition; 


h(g + f) =hg + hf. (1.7) 


The product of linear operators is not generally commutative. However, from 
(1.2b) it follows that 


$$f\alpha = \alpha f, \tag{1.8}$$


that is, all linear operators commute with the operation of scalar multiplica- 
tion. 

Note that in Equations (1.6, 7, 8) the linear operators can be regarded as 
combining with other operators rather than operating directly on vectors. The 
general rules for adding and multiplying operators are the same as rules of 
elementary scalar algebra, except for the restrictions on the commutative law. 
For this reason, the study and application of linear operators is called linear 
algebra or operator algebra. 
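The rules (1.6), (1.7) and (1.8) can be illustrated by treating operators as functions that are composed and added. The sketch below uses three simple operators on 3-component tuples; the particular operators and the helper names are assumptions chosen only for illustration.

```python
def compose(g, f):
    """Operator product gf: apply f first, then g."""
    return lambda x: g(f(x))

def add(g, f):
    """Operator sum g + f."""
    return lambda x: tuple(p + q for p, q in zip(g(x), f(x)))

def scalar(alpha):
    """The operator (1.5) that multiplies every vector by alpha."""
    return lambda x: tuple(alpha * xi for xi in x)

def f(x):
    return (-x[1], x[0], x[2])        # quarter turn about the third axis

def g(x):
    return (2 * x[0], x[1], x[2])     # stretch along the first axis

def h(x):
    return (x[0] + x[1], x[1], x[2])  # shear

x = (1, 2, 3)

assoc_left = compose(h, compose(g, f))(x)               # h(gf)
assoc_right = compose(compose(h, g), f)(x)              # (hg)f
dist_left = compose(h, add(g, f))(x)                    # h(g + f)
dist_right = add(compose(h, g), compose(h, f))(x)       # hg + hf
gf_x = compose(g, f)(x)
fg_x = compose(f, g)(x)
f_alpha = compose(f, scalar(5))(x)                      # f alpha
alpha_f = compose(scalar(5), f)(x)                      # alpha f
```

For this choice the products $gf$ and $fg$ give different vectors, while the associative, distributive and scalar-commutation rules all hold.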

The reader cannot have failed to notice that the abstract algebraic rules 
governing linear algebra are identical to rules governing geometric algebra. 
This identity is no accident. Every specific linear operator can be constructed 
from multivectors using the geometric sum and product alone. Equation 
(1.3), for example, gives the construction or, if you will, the definition of 
projection operators in terms of geometric algebra. We will find similar 
constructions for all the important linear operators. It will be apparent then 
that the associativity (1.6) and distributivity (1.7) of linear operators can be 
regarded as consequences of the associativity and distributivity of the geo- 
metric product. Thus, linear algebra can be regarded as an important applica- 
tion of geometric algebra rather than an independent mathematical system. 

Let us now turn to the development of some general concepts useful for 
characterizing and classifying linear operators. 


Adjoint Operators 


To every linear operator $f$ on $\mathcal{E}_3$ there corresponds another linear operator $\bar{f}$
on $\mathcal{E}_3$ uniquely defined by the condition that

$$y \cdot f(x) = \bar{f}(y) \cdot x \tag{1.9}$$

for all vectors $x$ and $y$ in $\mathcal{E}_3$. To emphasize that $f$ operates before the inner
product, we may write (1.9) in the form

$$y \cdot (fx) = (\bar{f}y) \cdot x. \tag{1.10}$$

The operator $\bar{f}$ is called the adjoint or transpose of $f$. Its significance will
become clear after we have seen how it can be used.
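In terms of components with respect to an orthonormal basis, the adjoint corresponds to the transposed matrix. The following sketch checks the defining condition (1.9) for one operator given by a matrix; the matrix entries, the vectors and the helper names are assumptions chosen for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def apply(M, x):
    """Apply the operator with matrix M (rows as tuples) to the vector x."""
    return tuple(dot(row, x) for row in M)

def transpose(M):
    return tuple(tuple(M[j][k] for j in range(3)) for k in range(3))

F = ((1, 2, 0),
     (0, 1, 3),
     (4, 0, 1))
x = (1, -2, 3)
y = (2, 0, -1)

lhs = dot(y, apply(F, x))             # y . f(x)
rhs = dot(apply(transpose(F), y), x)  # (adjoint of f applied to y) . x
```

Both sides evaluate to the same scalar, as required by (1.9).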


Outermorphisms 


Recall from Section 2-3 that a vector space $\mathcal{E}_3$ generates a geometric algebra
$\mathcal{G}_3$ with $\mathcal{E}_3 = \langle\mathcal{G}_3\rangle_1$. We shall now show how every linear transformation $f$ on
$\mathcal{E}_3$ induces a natural linear transformation $\underline{f}$ on $\mathcal{G}_3$. We define the induced
transformation $\underline{f}(x \wedge y)$ of a bivector $x \wedge y$ by

$$\underline{f}(x \wedge y) = f(x) \wedge f(y) = (fx) \wedge (fy). \tag{1.11}$$

Thus, $\underline{f}$ transforms bivectors into bivectors. The fact that it is a linear
transformation of bivectors follows from the linearity of the outer product. In
particular, the distributive rule for the outer product gives

$$\underline{f}(x \wedge y + x \wedge z) = \underline{f}(x \wedge y) + \underline{f}(x \wedge z). \tag{1.12}$$

Naturally, the induced transformation of a trivector into a trivector is
defined by

$$\underline{f}(x \wedge y \wedge z) = (fx) \wedge (fy) \wedge (fz). \tag{1.13}$$

Since every trivector is proportional to the dextral unit pseudoscalar $i$, we can
write

$$\underline{f}(x \wedge y \wedge z) = (\det f)\, x \wedge y \wedge z, \tag{1.14}$$

where $\det f$, called the determinant of $f$, is a scalar depending on $f$. Since
$x \wedge y \wedge z$ is the oriented volume of a parallelepiped with “edges” $x$, $y$, $z$, we can
interpret (1.14) as an induced change in scale of the volume by the factor $\det f$.
If $\det f$ is negative, then the orientation as well as the magnitude of the
volume is changed.

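Supposing that $x \wedge y \wedge z = i\,x \cdot (y \times z)$ is not zero, we can solve to get several
equivalent expressions for the determinant:

$$\det f = \frac{\underline{f}(x \wedge y \wedge z)}{x \wedge y \wedge z} = \frac{(fx) \cdot [(fy) \times (fz)]}{x \cdot (y \times z)} = (x \wedge y \wedge z)^{-1}\,\underline{f}(x \wedge y \wedge z). \tag{1.15}$$

The first equality can be regarded as a definition of the determinant. This is
consistent with the more general definition of a determinant given in Exercise
(1.8).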
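The interpretation of the determinant as a volume scale factor can be checked numerically by using the fact that the oriented volume of a parallelepiped with edges $x$, $y$, $z$ is proportional to the triple product $x \cdot (y \times z)$. In the sketch below the operator is given by an illustrative matrix, and all names and sample vectors are assumptions.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def apply(M, x):
    return tuple(dot(row, x) for row in M)

def triple(x, y, z):
    """Oriented volume x . (y cross z)."""
    return dot(x, cross(y, z))

def det_from_volumes(M, x, y, z):
    """Ratio of the transformed volume to the original one."""
    return triple(apply(M, x), apply(M, y), apply(M, z)) / triple(x, y, z)

F = ((1, 2, 0),
     (0, 1, 3),
     (4, 0, 1))

# Two different (non-degenerate) choices of edges give the same ratio.
d_skew_edges = det_from_volumes(F, (1, 0, 2), (0, 1, 1), (1, 1, 0))
d_basis_edges = det_from_volumes(F, (1, 0, 0), (0, 1, 0), (0, 0, 1))
```

The ratio does not depend on which parallelepiped is used, which is what makes $\det f$ a property of the operator alone.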

The induced transformation of a trivector is simpler than that of a bivector,
because it involves a change of scale only. However, since $x \wedge y$ is a directed
area, the linear transformation $\underline{f}(x \wedge y)$ can be interpreted as a change of scale
together with a change in direction of the directed area.

We can extend the induced transformation to the entire geometric algebra
by adopting the notation $\underline{f}x = fx$ for vectors and defining the induced
transformation of a scalar $\alpha$ by

$$\underline{f}(\alpha) = \alpha. \tag{1.16}$$

Then $\underline{f}$ is defined uniquely on all multivectors $X$, $Y$, . . . in $\mathcal{G}_3$. To sum up, the
operator $\underline{f}$ has the following properties: It is linear,

$$\underline{f}(\alpha X + \beta Y) = \alpha\underline{f}(X) + \beta\underline{f}(Y); \tag{1.17a}$$

It is grade-preserving,

$$\underline{f}(\langle X \rangle_k) = \langle \underline{f}(X) \rangle_k; \tag{1.17b}$$

and it preserves (i.e. commutes with) outer products,

$$\underline{f}(X \wedge Y) = \underline{f}(X) \wedge \underline{f}(Y). \tag{1.17c}$$

It does not preserve the inner product, that is, $\underline{f}(X \cdot Y)$ is not generally equal
to $\underline{f}(X) \cdot \underline{f}(Y)$.


The transformation $\underline{f}$ induced by $f$ is called an outermorphism of $\mathcal{G}_3$. The
root “morphism” is widely used in mathematics with reference to functions
which preserve some sort of mathematical structure. Thus, the name
“outermorphism” expresses the fact that $\underline{f}$ preserves the outer product.

Since the adjoint $\bar{f}$ of $f$ is also a linear transformation, it too induces an
outermorphism, which we designate by the same symbol $\bar{f}$ and define by
writing

$$\bar{f}(x_1 \wedge x_2 \wedge \ldots \wedge x_k) = (\bar{f}x_1) \wedge (\bar{f}x_2) \wedge \ldots \wedge (\bar{f}x_k). \tag{1.18}$$

Obviously, the general properties of the outermorphism $\bar{f}$ are the same as
those written down for $\underline{f}$ in (1.17a, b, c).


Nonsingular Linear Operators 


A linear operator $f$ on $\mathcal{E}_3$ is said to be nonsingular if and only if $\det f \neq 0$ or,
equivalently, $\underline{f}(i) \neq 0$.

If $f$ is a nonsingular linear operator, there exists a linear operator $f^{-1}$, called
the inverse of $f$, such that

$$f^{-1}f = 1, \tag{1.19}$$

where $1$ is the identity operator defined, in accordance with (1.5), by

$$1(x) = x. \tag{1.20}$$

Thus, for any vector $x$ in $\mathcal{E}_3$, we have

$$f^{-1}f(x) = x. \tag{1.21}$$

The inverse operator can be computed from $f$ by using the equation

$$f^{-1}(y) = \frac{\bar{f}(yi)}{\bar{f}(i)} = \frac{\bar{f}(yi)}{i\,\det f}, \tag{1.22}$$

which is obviously valid only if $f$ is nonsingular. Note the role of the adjoint
outermorphism and the double dual in (1.22). The right side of (1.22) shows
that $f^{-1}(y)$ is obtained from the induced transformation of the bivector $yi$ dual
to $y$.

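To prove (1.22), we employ the factorization $i = \sigma_1\sigma_2\sigma_3$ of the pseudoscalar
and proceed as follows:

$$\begin{aligned}
x\,\bar{f}(i) &= x \cdot [(\bar{f}\sigma_1) \wedge (\bar{f}\sigma_2) \wedge (\bar{f}\sigma_3)] \\
&= x \cdot (\bar{f}\sigma_1)\,\bar{f}(\sigma_2 \wedge \sigma_3) - x \cdot (\bar{f}\sigma_2)\,\bar{f}(\sigma_1 \wedge \sigma_3) + x \cdot (\bar{f}\sigma_3)\,\bar{f}(\sigma_1 \wedge \sigma_2) \\
&= (fx) \cdot \sigma_1\,\bar{f}(i\sigma_1) + (fx) \cdot \sigma_2\,\bar{f}(i\sigma_2) + (fx) \cdot \sigma_3\,\bar{f}(i\sigma_3) \\
&= \bar{f}(i\,fx).
\end{aligned}$$

Dividing this by $\bar{f}(i)$ and using the defining identity (1.21), we get (1.22) as
required. The student should carefully consider the justification for each step
in this proof.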
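The inverse formula can be tried out numerically by translating it into vector algebra. The sketch below rests on two assumptions that should be checked against the definitions above: the bivector dual to $y$ is written as an outer product $u \wedge w$ of two vectors with $u \times w = y$, and the adjoint is represented by the transposed matrix. The inverse is then the cross product of the adjoint-transformed factors divided by $\det f$. All function names and sample values are hypothetical.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def apply(M, x):
    return tuple(dot(row, x) for row in M)

def transpose(M):
    return tuple(tuple(M[j][k] for j in range(3)) for k in range(3))

def det3(M):
    """Determinant as the triple product of the rows."""
    return dot(M[0], cross(M[1], M[2]))

def inverse_apply(M, y):
    """Compute the inverse operator applied to a nonzero vector y."""
    norm_y = math.sqrt(dot(y, y))
    # Pick a helper direction that is safely not parallel to y.
    t = (1.0, 0.0, 0.0) if abs(y[0]) < 0.9 * norm_y else (0.0, 1.0, 0.0)
    p = cross(y, t)
    norm_p = math.sqrt(dot(p, p))
    u = tuple(pi / norm_p for pi in p)  # unit vector orthogonal to y
    w = cross(y, u)                     # then u cross w equals y
    Mt = transpose(M)                   # matrix of the adjoint operator
    numerator = cross(apply(Mt, u), apply(Mt, w))
    d = det3(M)
    return tuple(c / d for c in numerator)

F = ((1.0, 2.0, 0.0),
     (0.0, 1.0, 3.0),
     (4.0, 0.0, 1.0))
y = (2.0, -1.0, 3.0)
x = inverse_apply(F, y)
check = apply(F, x)  # should reproduce y
```

Applying the operator to the computed vector returns $y$ up to rounding, so the construction does produce the inverse for this example.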


Matrix Representations of Linear Operators 


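For some kinds of computation it is convenient to employ a standard basis $\sigma_1$,
$\sigma_2$, $\sigma_3$ for $\mathcal{E}_3$ defined by the orthonormality condition

$$\sigma_j \cdot \sigma_k = \delta_{jk} \tag{1.23}$$

($j, k = 1, 2, 3$), and the relation

$$\sigma_1 \wedge \sigma_2 \wedge \sigma_3 = \sigma_1\sigma_2\sigma_3 = i \tag{1.24}$$

of the base vectors to the dextral unit pseudoscalar.

Any vector $x$ in $\mathcal{E}_3$ can be expanded in a standard basis; thus,

$$x = \sum_k \sigma_k x_k = \sigma_1 x_1 + \sigma_2 x_2 + \sigma_3 x_3. \tag{1.25a}$$

The scalar components $x_k$ in the expansion (1.25a) are given by

$$x_k = \sigma_k \cdot x, \tag{1.25b}$$

for $k = 1, 2, 3$.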

A linear operator $f$ transforms each vector $\sigma_k$ in the standard basis into a
vector $f_k$ which can be expanded in the standard basis, as expressed by the
equation

$$f_k = f\sigma_k = \sum_j \sigma_j f_{jk}. \tag{1.26a}$$

Each of the scalar coefficients $f_{jk}$ is called a matrix element of the operator $f$,
and the set of all such matrix elements denoted by $[f] = [f_{jk}]$ is called the
matrix of $f$ in the standard basis. The matrix is called a 3 × 3 matrix to
indicate the range of the indices $j, k = 1, 2, 3$. The matrix elements of $f$ are
given by

$$f_{jk} = \sigma_j \cdot f_k = \sigma_j \cdot (f\sigma_k). \tag{1.26b}$$

The complete matrix can be written as an array of the matrix elements in the
following way:

$$[f] = [f_{jk}] = \begin{pmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{pmatrix}.$$

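A linear operator is completely determined by its matrix in a given basis,
for the matrix determines the transformation of the basis by (1.26a), which, in
turn, determines the transformation of any given vector:

$$fx = \sum_k (f\sigma_k) x_k = \sum_j \sum_k \sigma_j f_{jk} x_k. \tag{1.27}$$

Consequently, the equation

$$fx = y \tag{1.28a}$$

is equivalent to the matrix equation

$$\sum_k f_{jk} x_k = y_j, \tag{1.28b}$$

which is actually a set of 3 simultaneous equations obtained by dotting (1.28a)
with each of the vectors $\sigma_j$ and using (1.27). This can be expressed by writing
the matrix equation (1.28b) as an array of the form

$$\begin{pmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} f_{11}x_1 + f_{12}x_2 + f_{13}x_3 \\ f_{21}x_1 + f_{22}x_2 + f_{23}x_3 \\ f_{31}x_1 + f_{32}x_2 + f_{33}x_3 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}.$$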

The set of 3 × 3 matrices can be made into a matrix algebra which is
equivalent to the linear algebra of operators on $\mathcal{E}_3$. Thus, the operator sum
$f + g$ corresponds to the matrix sum

$$f_{jk} + g_{jk} = \sigma_j \cdot (f\sigma_k + g\sigma_k). \tag{1.29}$$

The operator product $gf$ corresponds to the matrix product

$$\sum_j g_{ij} f_{jk} = \sigma_i \cdot (gf\sigma_k), \tag{1.30a}$$

since

$$gf_k = \sum_j (g\sigma_j) f_{jk} = \sum_i \sigma_i \Big( \sum_j g_{ij} f_{jk} \Big). \tag{1.30b}$$

Thus, the product of matrices is equal to the matrix of the operator product:

$$[g][f] = [gf]. \tag{1.30c}$$
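The correspondence between operators and matrices can be made concrete by computing matrix elements as inner products of basis vectors with transformed basis vectors, as in (1.26b). The sketch below does this for two illustrative operators; the operators themselves and all helper names are assumptions made for the example.

```python
BASIS = ((1, 0, 0), (0, 1, 0), (0, 0, 1))  # components of the standard basis

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matrix_of(op):
    """Matrix elements: basis vector j dotted with op(basis vector k)."""
    return [[dot(BASIS[j], op(BASIS[k])) for k in range(3)] for j in range(3)]

def mat_vec(M, x):
    return tuple(dot(row, x) for row in M)

def mat_mul(G, F):
    return [[sum(G[i][j] * F[j][k] for j in range(3)) for k in range(3)]
            for i in range(3)]

def f(x):
    # Illustrative operator of the form 3x + a (b . x)
    a, b = (1, 2, 2), (0, 1, -1)
    s = dot(b, x)
    return tuple(3 * xi + ai * s for xi, ai in zip(x, a))

def g(x):
    return (-x[1], x[0], x[2])  # quarter turn about the third axis

def gf(x):
    return g(f(x))

x = (1, -2, 4)
F, G = matrix_of(f), matrix_of(g)

via_matrix = mat_vec(F, x)   # matrix equation (1.28b)
direct = f(x)                # operator equation (1.28a)
product_matrix = mat_mul(G, F)
matrix_of_product = matrix_of(gf)
```

The matrix acts on components exactly as the operator acts on the vector, and the matrix of the composite operator equals the product of the matrices.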


According to (1.21) and (1.23), the identity matrix corresponding to the
identity operator is determined by

$$\sigma_j \cdot 1(\sigma_k) = \sigma_j \cdot \sigma_k = \delta_{jk}. \tag{1.31}$$

Consequently, the operator equation $1f = f$ corresponds to the matrix equation

$$\sum_j \delta_{ij} f_{jk} = f_{ik}, \quad\text{or}\quad [1][f] = [f]. \tag{1.32}$$

And the equation $f^{-1}f = 1$ corresponds to

$$\sum_j f^{-1}_{ij} f_{jk} = \delta_{ik}, \quad\text{or}\quad [f^{-1}][f] = [1]. \tag{1.33}$$


Any other operator equation can be converted into a matrix equation in a 
similar way, and vice-versa. 

For any matrix $[f]$ we define the determinant as equal to the determinant
of the corresponding operator $f$. Thus, using (1.26b) in (1.15) we can write

$$\det f = \det [f] = \det (\sigma_j \cdot f_k) = (\sigma_3 \wedge \sigma_2 \wedge \sigma_1) \cdot (f_1 \wedge f_2 \wedge f_3). \tag{1.34}$$


The value of the determinant is a scalar which can be computed from the 
matrix elements by the Laplace expansion (Exercise (1.9)). 
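For readers who wish to experiment, the expansion of a determinant along its first row can be written as a short recursive routine. This is a generic sketch of the Laplace expansion for a square matrix stored as a list of rows, not a transcription of the exercise; the function name and the test matrices are assumptions.

```python
def laplace_det(M):
    """Determinant by expansion along the first row of a square matrix."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for k in range(n):
        # Minor: delete the first row and column k.
        minor = [row[:k] + row[k + 1:] for row in M[1:]]
        total += (-1) ** k * M[0][k] * laplace_det(minor)
    return total

F = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
F_rows_swapped = [F[1], F[0], F[2]]
F_equal_rows = [F[0], F[0], F[2]]

d = laplace_det(F)
d_swapped = laplace_det(F_rows_swapped)
d_equal = laplace_det(F_equal_rows)
```

The examples also illustrate two familiar properties: interchanging two rows reverses the sign, and a determinant with two equal rows vanishes.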

Matrix algebra is widely used in mathematics and physics to carry out 
calculations with linear operators. Since the matrix elements are scalar, 
matrix algebra has the advantage of reducing all such calculations to addition 
and multiplication of real numbers. It has the disadvantage, however, of 
requiring that a basis be introduced which may be quite irrelevant to the 
problem at hand, and this often obscures the geometrical meaning of the 
transformations involved. 

Geometric algebra is a more general and efficient computational tool than 
matrix algebra. In the next two sections we shall see how the most important 
linear transformations can be expressed in terms of geometric algebra so that 
computations can be carried out without introducing an arbitrary basis. This 
is not to say that we shall dispense with matrix algebra. Rather we shall regard 
it as subsidiary to geometric algebra. In some problems a basis is natural or 
information is given in matrix form, so matrices should be used. In other 
problems, which we shall formulate and solve without matrices, the results
will be put in matrix form for comparison with standard treatments using 
matrix algebra. Matrix algebra itself is simplified and clarified when used in 
conjunction with the operations of geometric algebra, because geometric 
algebra enables us to operate directly with vectors without decomposing them 
into components. 


5-1. Exercises 


(1.1) Prove that $\underline{f}(x \wedge y) = 0$ for $x \wedge y \neq 0$ if and only if $f(z) = 0$ for some
nonzero vector $z$ in the $xy$-plane.

(1.2) Prove that $\underline{f}(\alpha X) = \alpha\underline{f}(X)$ when $X$ is a $k$-blade.

(1.3) Prove that $(u \wedge v) \cdot \underline{f}(x \wedge y) = \bar{f}(u \wedge v) \cdot (x \wedge y)$. Generalize the proof to
show that $\det \bar{f} = \det f$.

(1.4) Prove that the following propositions about a linear transformation
$f$ on $\mathcal{E}_3$ are equivalent:

(a) $f$ is nonsingular.

(b) $f(x) = 0$ if and only if $x = 0$.

(c) To every vector $y$ there corresponds a unique vector $x$ such
that $y = f(x)$.

(1.5) Prove the following identities:

(a) $\det (gf) = (\det g)(\det f)$.

(b) $\det (f^{-1}) = (\det f)^{-1}$.

(1.6) To find the inverse of a linear transformation, Equation (1.22) can
always be used, but a more direct approach is often better. Find the
inverse of

$$fx = \alpha x + a\,b \cdot x$$

by solving the algebraic equation $y = fx$ for $x$ as a function of $y$.

(1.7) Find the inverse of the linear transformation

$$gx = \alpha x + x \cdot B = \alpha x + b \times x,$$

where $B = ib$ is, of course, a bivector.

(1.8) The entire treatment of linear operators and matrices in this section
is easily generalized to vector spaces of any finite dimension. Details
are given in the book Geometric Calculus (1984), but let us look at
some of the basic ideas.

A set of linearly independent vectors $a_1$, $a_2$, . . ., $a_n$ is a frame (or
basis) for n-dimensional vector space. By generalizing the argument
in Exercise (2-1.2), it can be proved that $a_1 \wedge a_2 \wedge \ldots \wedge a_n \neq 0$ is a
necessary and sufficient condition for the vectors to be linearly
independent.

Any matrix of scalars $\alpha_{ij}$, with $i, j = 1, \ldots, n$, can be expressed
in the form $\alpha_{ij} = a_i \cdot b_j$, where the $a_i$ and $b_j$ are vectors. The determinant
of the matrix is defined by

$$\det \alpha_{ij} = \det a_i \cdot b_j = (a_n \wedge \ldots \wedge a_1) \cdot (b_1 \wedge \ldots \wedge b_n).$$

The determinant is commonly represented as an array; thus,

$$\begin{vmatrix} \alpha_{11} & \ldots & \alpha_{1n} \\ \vdots & & \vdots \\ \alpha_{n1} & \ldots & \alpha_{nn} \end{vmatrix} = \begin{vmatrix} a_1 \cdot b_1 & \ldots & a_1 \cdot b_n \\ \vdots & & \vdots \\ a_n \cdot b_1 & \ldots & a_n \cdot b_n \end{vmatrix}.$$

The number of rows and columns in a determinant is called its rank.

All the properties of determinants are consequences of general
properties of inner and outer products established in Section 2-1.
Establish the following properties: A determinant

(a) changes sign if any two rows are interchanged;

(b) is unchanged by an interchange of rows and columns;

(c) vanishes if two rows are equal;

(d) vanishes if the rows are linearly dependent.

(1.9) A determinant of rank n can be reduced to determinants of lower
rank by using the Laplace expansion:

$$(a_n \wedge \ldots \wedge a_1) \cdot (b_1 \wedge \ldots \wedge b_n) = \sum_{k=1}^{n} (-1)^{k+1}\, a_1 \cdot b_k\, (a_n \wedge \ldots \wedge a_2) \cdot (b_1 \wedge \ldots \wedge \check{b}_k \wedge \ldots \wedge b_n),$$

where $\check{b}_k$ means that $b_k$ is omitted from the product. Derive the result.

(1.10) Use the Laplace expansion to evaluate the determinant of a linear
operator in terms of its matrix elements. Specifically, from Equation
(1.34), derive the result

$$\det f_{jk} = f_{11}(f_{22}f_{33} - f_{23}f_{32}) - f_{12}(f_{21}f_{33} - f_{23}f_{31}) + f_{13}(f_{21}f_{32} - f_{22}f_{31}).$$

(1.11) The equation

$$\alpha_1 a_1 + \alpha_2 a_2 + \ldots + \alpha_n a_n = c$$

can be solved for the scalars $\alpha_k$ in terms of the vectors $a_k$ and $c$ if the
$a_k$ are linearly independent, that is, if $A_n = a_1 \wedge \ldots \wedge a_n \neq 0$.
Derive the solution

$$\alpha_k = \frac{a_1 \wedge \ldots \wedge (c)_k \wedge \ldots \wedge a_n}{a_1 \wedge \ldots \wedge a_n},$$

where $(c)_k$ indicates that $c$ has been substituted for $a_k$ in the product
$a_1 \wedge \ldots \wedge a_n$. Suppose that $B_n = b_1 \wedge \ldots \wedge b_n \neq 0$ is proportional
to $A_n$. Derive Cramer's rule, expressing the $\alpha_k$ as a ratio of determinants:

$$\alpha_k = \frac{(b_n \wedge \ldots \wedge b_1) \cdot (a_1 \wedge \ldots \wedge (c)_k \wedge \ldots \wedge a_n)}{(b_n \wedge \ldots \wedge b_1) \cdot (a_1 \wedge \ldots \wedge a_n)}.$$


(1.12) Frames and Reciprocal Frames

A frame $\{e_k, k = 1, 2, 3\}$ of vectors in $\mathcal{E}_3(i)$ determines a pseudoscalar
$e_1 \wedge e_2 \wedge e_3$ which is necessarily a non-vanishing scalar multiple
of the righthanded unit pseudoscalar $i$; thus, $e_1 \wedge e_2 \wedge e_3 = ei$. The
determinant of the frame $e = -i\,e_1 \wedge e_2 \wedge e_3$ is positive (negative) if the
frame is right (left) handed. The reciprocal frame $\{e^k, k = 1, 2, 3\}$ is
determined by the set of equations

$$e^j \cdot e_k = \delta^j_k,$$

for $j, k = 1, 2, 3$, where $\delta^j_k = 1$ if $k = j$ and $\delta^j_k = 0$ if $k \neq j$. Show
that the unique solution of these equations is given by

$$e^1 = \frac{e_2 \wedge e_3}{e_1 \wedge e_2 \wedge e_3} = \frac{e_2 \times e_3}{e},$$

$$e^2 = \frac{e_3 \wedge e_1}{e_1 \wedge e_2 \wedge e_3} = \frac{e_3 \times e_1}{e},$$

$$e^3 = \frac{e_1 \wedge e_2}{e_1 \wedge e_2 \wedge e_3} = \frac{e_1 \times e_2}{e}.$$

Note that the orthonormal frame $\{\sigma_k\}$ is reciprocal to itself.
Any vector $a$ in $\mathcal{E}_3(i)$ can be expressed as the linear combination

$$a = a^1 e_1 + a^2 e_2 + a^3 e_3 = a^k e_k,$$

where the summation convention has been used to abbreviate the
sum on the right. The scalar coefficients $a^k$ are commonly called
contravariant components of the vector $a$ (with respect to the frame
$\{e_k\}$). The reciprocal frame simplifies the problem of determining
these coefficients from $a$ and $\{e_k\}$. Show that

$$a^k = e^k \cdot a,$$

and that this solution is merely an application of Cramer's rule
(Exercise (1.11)). Similarly show that the covariant components $a_k$
of $a$, which are defined by the equation $a = a_k e^k$, are determined by
the equations $a_k = e_k \cdot a$.

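(1.13) Let $\mathcal{E}_n$ be an n-dimensional vector space with an orthonormal basis
$\sigma_1$, $\sigma_2$, . . ., $\sigma_n$ and pseudoscalar $i = \sigma_1\sigma_2 \ldots \sigma_n$. For a linear
operator $f$ on $\mathcal{E}_n$, the matrix elements of the adjoint operator $\bar{f}$ are
given by

$$\bar{f}_{jk} = \sigma_j \cdot (\bar{f}\sigma_k) = (f\sigma_j) \cdot \sigma_k = f_{kj}.$$

Thus, the matrix element $\bar{f}_{jk}$ is obtained simply by transposing the
indices on $f_{jk}$. The transformation of the basis by $\bar{f}$ is therefore given
by

$$\bar{f}_k = \bar{f}\sigma_k = \sum_j \sigma_j \bar{f}_{jk} = \sum_j \sigma_j f_{kj}.$$

Show that the matrix elements of the inverse operator $f^{-1}$ are given
by the ratio of determinants

$$f^{-1}_{jk} = \sigma_j \cdot (f^{-1}\sigma_k) = \frac{(\sigma_n \wedge \ldots \wedge \sigma_1) \cdot (\bar{f}_1 \wedge \ldots \wedge (\sigma_j)_k \wedge \ldots \wedge \bar{f}_n)}{\det f_{jk}},$$

where $(\sigma_j)_k$ indicates that $\bar{f}_k$ has been replaced by $\sigma_j$.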
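The reciprocal frame of the exercise on frames can be explored numerically with cross products. The sketch below builds the reciprocal vectors from $e_2 \times e_3$, $e_3 \times e_1$, $e_1 \times e_2$ divided by the frame determinant $e = e_1 \cdot (e_2 \times e_3)$, and then uses them to read off the components of a vector in a non-orthogonal frame. The sample frame, the sample vector and the function names are assumptions chosen for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def scale(u, s):
    return tuple(s * ui for ui in u)

def reciprocal_frame(e1, e2, e3):
    """Reciprocal vectors of a frame with nonzero determinant."""
    e = dot(e1, cross(e2, e3))  # determinant of the frame
    return (scale(cross(e2, e3), 1.0 / e),
            scale(cross(e3, e1), 1.0 / e),
            scale(cross(e1, e2), 1.0 / e))

frame = ((1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0))
recip = reciprocal_frame(*frame)

# Table of inner products of reciprocal vectors with frame vectors.
gram = [[dot(recip[j], frame[k]) for k in range(3)] for j in range(3)]

# Components of a vector with respect to the frame, and the reconstruction.
a = (2.0, 3.0, 5.0)
components = [dot(recip[k], a) for k in range(3)]
rebuilt = tuple(sum(components[k] * frame[k][n] for k in range(3))
                for n in range(3))
```

The table of inner products is the identity, and the components obtained from the reciprocal vectors rebuild the original vector.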


5-2. Symmetric and Skewsymmetric Operators 


In this section we study the properties of symmetric and skewsymmetric linear 
operators. Although the mathematical results have many physical appli- 
cations, they will be needed in this book only to determine the forms of 
inertia tensors for rigid bodies. So the student can skip this section until that 
information is required. 

A linear operator $\mathcal{S}$ is said to be symmetric (or self-adjoint) if $\bar{\mathcal{S}} = \mathcal{S}$, that
is, if it is equal to its adjoint. Similarly, a linear operator $\mathcal{A}$ is said to be
skewsymmetric (or antisymmetric) if $\bar{\mathcal{A}} = -\mathcal{A}$. Any linear operator f
can be uniquely expressed as the sum of a symmetric part $f_+$ and a
skewsymmetric part $f_-$. One simply forms the operator identity

    $f = \tfrac{1}{2}(f + \bar{f}) + \tfrac{1}{2}(f - \bar{f})$.

Hence,

    $f = f_+ + f_-$,    (2.1a)

where

    $f_\pm = \tfrac{1}{2}(f \pm \bar{f})$.    (2.1b)
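
As a numerical illustration of (2.1a, b), the following minimal sketch splits a matrix into its symmetric and skewsymmetric parts. It assumes the operator is represented by its matrix in an orthonormal basis, so that the adjoint corresponds to the transposed matrix; the function names and the sample matrix are illustrative only.

```python
def transpose(m):
    """Return the transpose of a square matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def symmetric_skew_parts(m):
    """Split a square matrix into (symmetric, skewsymmetric) parts, as in (2.1b)."""
    mt = transpose(m)
    n = len(m)
    sym = [[0.5 * (m[j][k] + mt[j][k]) for k in range(n)] for j in range(n)]
    skew = [[0.5 * (m[j][k] - mt[j][k]) for k in range(n)] for j in range(n)]
    return sym, skew


# Arbitrary sample matrix (not taken from the text).
f = [[1.0, 2.0, 0.0],
     [4.0, 3.0, -1.0],
     [6.0, 5.0, 2.0]]
f_plus, f_minus = symmetric_skew_parts(f)
```

Adding `f_plus` and `f_minus` entry by entry recovers `f`, which is the content of (2.1a).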


Skewsymmetric Operators 


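We consider skewsymmetric operators first, because they are so easy to
characterize completely. Indeed, any skewsymmetric transformation $\mathcal{A}$ can
be put in the canonical (or standard) form

    $\mathcal{A}x = x \cdot A$,    (2.2)

where A is a unique bivector. All the properties of $\mathcal{A}$ are therefore determined
by the algebraic properties of the bivector A; the skewsymmetry, for
example, follows from

    $y \cdot (\mathcal{A}x) = y \cdot (x \cdot A) = (y \wedge x) \cdot A = -(x \wedge y) \cdot A = -x \cdot (y \cdot A) = -x \cdot (\mathcal{A}y)$.

We can prove (2.2) by using the fact that $\mathcal{A}$ is completely determined by the
transformation $a_k = \mathcal{A}\sigma_k = \sum_j \sigma_j A_{jk}$ of a standard basis. In terms of the
standard basis, the bivector A is given by

    $A = \tfrac{1}{2}\sum_k \sigma_k \wedge a_k = \tfrac{1}{2}\sum_k \sigma_k \wedge (\mathcal{A}\sigma_k) = \tfrac{1}{2}\sum_{j,k} \sigma_k \wedge \sigma_j A_{jk}$.    (2.3)

This is proved by

    $\sigma_j \cdot A = \tfrac{1}{2}\sum_k \sigma_j \cdot (\sigma_k \wedge a_k) = \tfrac{1}{2}\sum_k (\sigma_j \cdot \sigma_k\, a_k - \sigma_k\, \sigma_j \cdot a_k)$

    $\quad = \tfrac{1}{2}\left(a_j - \sum_k \sigma_k A_{jk}\right) = \tfrac{1}{2}(\mathcal{A}\sigma_j - \bar{\mathcal{A}}\sigma_j) = \mathcal{A}\sigma_j$.

This establishes (2.2) for a standard basis, whence, by linearity, the result is
generally true.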

By way of example, note that the magnetic force on a charged particle is a
skewsymmetric linear function $f = \mathcal{F}v$ of the particle velocity. Thus,

    $f = \frac{q}{c}\, v \times B = -\frac{q}{c}\, i(v \wedge B) = v \cdot \left(-\frac{q}{c}\, iB\right)$.

Another important skewsymmetric operator will be seen to arise from dif- 
ferentiating a rotation. 
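
The skewsymmetry of the magnetic force can be checked numerically. The sketch below uses ordinary vector components, sets the constant factor of charge over c to 1, and picks arbitrary sample vectors; all names are illustrative assumptions rather than notation from the text.

```python
def cross(a, b):
    """Vector cross product a x b in three dimensions."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(a, b))


B = (0.0, 0.0, 2.0)          # sample magnetic field


def force(v):
    """Magnetic force v x B, with the charge/c factor set to 1."""
    return cross(v, B)


x = (1.0, 2.0, 3.0)
y = (-2.0, 0.5, 4.0)
lhs = dot(y, force(x))       # y . (F x)
rhs = -dot(x, force(y))      # -x . (F y), equal to lhs for a skewsymmetric F
```

Skewsymmetry also implies that the force is orthogonal to the velocity, so `dot(x, force(x))` vanishes.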


Eigenvectors and Eigenvalues 


If a nonzero vector e is transformed into a scalar multiple of itself by a linear
operator f, we have the equation

    $fe = \lambda e$,    (2.4)

where $\lambda$ is a scalar. We say that e is an eigenvector of f corresponding to the
eigenvalue $\lambda$. Obviously, any nonzero scalar multiple of e is also an eigenvector
of f. The problem of finding the eigenvalues and/or the eigenvectors for a
given operator is called the eigenvalue or eigenvector problem.

The simplicity of the "eigenvalue equation" (2.4) shows that very basic
properties of a linear transformation are described by its eigenvectors and
eigenvalues. Therefore, it is often important to determine these properties if
they are not evident from the form in which the transformation is given. For
example, if we are given the matrix $f_{jk}$ of an operator f, then we have the
vectors

    $f_k = f\sigma_k = \sum_{j=1}^{3} \sigma_j f_{jk}$.

To develop a general method for solving the eigenvalue problem from this
information, note that (2.4) can be written in the form

    $(f - \lambda)e = 0$,    (2.5)

showing that the operator $(f - \lambda)$ is singular. But every singular operator has
a vanishing determinant, hence

    $\det(f - \lambda) = \frac{(f_1 - \lambda\sigma_1) \wedge (f_2 - \lambda\sigma_2) \wedge (f_3 - \lambda\sigma_3)}{\sigma_1 \wedge \sigma_2 \wedge \sigma_3} = 0$.    (2.6)

This is commonly called the secular equation for f. The left side of (2.6) is a
third degree polynomial in $\lambda$, with coefficients composed of the $f_{jk}$. The reader
is invited to expand the numerator and show that (2.6) can be put in the form

    $\lambda^3 - \alpha_1\lambda^2 + \alpha_2\lambda - \alpha_3 = 0$,    (2.7a)

where the scalar coefficients are given by

    $\alpha_1 = \sum_k \sigma_k \cdot f_k = f_{11} + f_{22} + f_{33}$,    (2.7b)

    $\alpha_2 = (\sigma_2 \wedge \sigma_1) \cdot (f_1 \wedge f_2) + (\sigma_3 \wedge \sigma_2) \cdot (f_2 \wedge f_3) + (\sigma_1 \wedge \sigma_3) \cdot (f_3 \wedge f_1)$,    (2.7c)

    $\alpha_3 = \det f = (\sigma_3 \wedge \sigma_2 \wedge \sigma_1) \cdot (f_1 \wedge f_2 \wedge f_3)$.    (2.7d)

Since the secular equation is an algebraic equation of the third degree, the
fundamental theorem of algebra tells us that it has at most three distinct
roots, some of which may be complex numbers. The real roots are the desired
eigenvalues. Complex roots are also regarded as eigenvalues in conventional
treatments of linear algebra, but geometric algebra makes this unnecessary,
as explained below and at the end of Section 5-3.

After the eigenvalues have been determined, the corresponding eigenvectors
can be found from (2.5). To do this, it is convenient to write (2.5) in the
form

    $e_1 g_1 + e_2 g_2 + e_3 g_3 = 0$,    (2.8a)

where the vectors

    $g_k = f_k - \lambda\sigma_k$    (2.8b)

are known for each eigenvalue $\lambda$ and the scalar components $e_k = e \cdot \sigma_k$ of the
eigenvector are to be determined. Equation (2.8a) can be solved for ratios of
the $e_k$ (Cramer's rule). Thus, we can "wedge" (2.8a) with $g_3$ to get

    $e_1\, g_1 \wedge g_3 + e_2\, g_2 \wedge g_3 = 0$.

Whence

    $\frac{e_2}{e_1} = \frac{g_3 \wedge g_1}{g_2 \wedge g_3}$.    (2.9a)

Similarly,

    $\frac{e_3}{e_1} = \frac{g_1 \wedge g_2}{g_2 \wedge g_3}$.    (2.9b)

Since the length and sense (or orientation) of the eigenvector e are not
determined by the eigenvector equation (2.5), we are free to fix them by
assigning any convenient value to the component $e_1$; then $e_2$ and $e_3$ are
uniquely determined by (2.9a, b).

If $\lambda$ is a single root of the secular equation, then two of the three vectors
$g_k = f_k - \lambda\sigma_k$ are necessarily linearly independent, and only one component
of the eigenvector e can be specified arbitrarily, as we have seen. However, if
$\lambda$ is a double root of the secular equation, no two of the $g_k$ are linearly
independent and two components of e can be specified arbitrarily. In this case,
Equations (2.9a, b) cannot be used to obtain $e_2$ and $e_3$. However, we are free to set
$e_1 = e_2 = 1$, so going back to (2.8a) we get

    $g_1 + g_2 + e_3 g_3 = 0$,    (2.10a)

from which $e_3$ can be obtained trivially. Alternatively, we can set $e_1 = 1$ and
$e_3 = 0$, so (2.8a) reduces to

    $g_1 + e_2 g_2 = 0$.    (2.10b)

The eigenvector we get from (2.10b) is linearly independent of the eigenvector
we get from (2.10a). Any other eigenvector obtained by a different choice
of components will be a linear combination of these two eigenvectors. Thus,
the eigenvectors corresponding to a double root of the secular equation form
a plane, so the eigenvector problem is solved when two independent vectors
in that plane have been found.

A secular equation with a multiple root is said to be degenerate; more
specifically, it is said to be k-fold degenerate if the root has multiplicity k. To
an eigenvalue with multiplicity k there correspond at most k linearly independent
eigenvectors (exactly k when the operator is symmetric), which can be
found in the manner described for k = 2.


Example 


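Let us see how the general method works on a specific example. Let us solve
the eigenvalue problem for the linear transformation specified by the matrix

    $[f_{jk}] = \begin{pmatrix} 4 & -1 & -1 \\ -1 & 4 & -1 \\ -1 & -1 & 4 \end{pmatrix}$.

Operating on a standard basis, this matrix gives

    $f\sigma_1 = 4\sigma_1 - \sigma_2 - \sigma_3 = f_1$,
    $f\sigma_2 = -\sigma_1 + 4\sigma_2 - \sigma_3 = f_2$,    (2.11)
    $f\sigma_3 = -\sigma_1 - \sigma_2 + 4\sigma_3 = f_3$.

From these vectors we calculate

    $f_1 \wedge f_2 = (4\sigma_1 - \sigma_2 - \sigma_3) \wedge (-\sigma_1 + 4\sigma_2 - \sigma_3)$,

which after expansion and collection of like terms takes the form

    $f_1 \wedge f_2 = 15\,\sigma_1 \wedge \sigma_2 + 5\,\sigma_2 \wedge \sigma_3 + 5\,\sigma_3 \wedge \sigma_1$.

Similarly, we find

    $f_2 \wedge f_3 = 5\,\sigma_1 \wedge \sigma_2 + 15\,\sigma_2 \wedge \sigma_3 + 5\,\sigma_3 \wedge \sigma_1$,
    $f_3 \wedge f_1 = 5\,\sigma_1 \wedge \sigma_2 + 5\,\sigma_2 \wedge \sigma_3 + 15\,\sigma_3 \wedge \sigma_1$,

as well as

    $f_1 \wedge f_2 \wedge f_3 = 50\,\sigma_1 \wedge \sigma_2 \wedge \sigma_3$.

We use these multivectors in (2.7b, c, d) to evaluate the coefficients in the
secular equation; thus,

    $\alpha_1 = 4 + 4 + 4 = 12$,
    $\alpha_2 = 15 + 15 + 15 = 45$,
    $\alpha_3 = 50$.

Hence the secular equation (2.7a) takes the specific form

    $\lambda^3 - 12\lambda^2 + 45\lambda - 50 = 0$.

This polynomial has the factored form

    $(\lambda - 2)(\lambda - 5)^2 = 0$.

Hence the eigenvalues are 2 and 5, the latter with double degeneracy.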
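
The coefficients of the secular equation can be cross-checked in matrix terms. The sketch below relies on the standard matrix facts that the first coefficient is the trace, the second is the sum of the principal 2x2 minors, and the third is the determinant; the function names are illustrative, and the matrix is the one with 4 on the diagonal and -1 elsewhere.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def secular_coefficients(m):
    """Return (alpha1, alpha2, alpha3) of lambda^3 - a1 lambda^2 + a2 lambda - a3."""
    a1 = m[0][0] + m[1][1] + m[2][2]                       # trace
    a2 = sum(m[j][j] * m[k][k] - m[j][k] * m[k][j]         # principal 2x2 minors
             for j, k in ((0, 1), (1, 2), (0, 2)))
    a3 = det3(m)
    return a1, a2, a3


def secular_polynomial(m, lam):
    """Evaluate the left side of the secular equation at lam."""
    a1, a2, a3 = secular_coefficients(m)
    return lam ** 3 - a1 * lam ** 2 + a2 * lam - a3


f = [[4, -1, -1],
     [-1, 4, -1],
     [-1, -1, 4]]
coefficients = secular_coefficients(f)
```

Evaluating `secular_polynomial(f, 2)` and `secular_polynomial(f, 5)` gives zero, consistent with the factored form.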
To determine the eigenvector corresponding to the eigenvalue $\lambda = 2$, we
use (2.11) in (2.8b) to get

    $g_1 = f_1 - 2\sigma_1 = 2\sigma_1 - \sigma_2 - \sigma_3$,
    $g_2 = f_2 - 2\sigma_2 = -\sigma_1 + 2\sigma_2 - \sigma_3$,
    $g_3 = f_3 - 2\sigma_3 = -\sigma_1 - \sigma_2 + 2\sigma_3$.

From this we obtain

    $g_1 \wedge g_2 = 3(\sigma_1 \wedge \sigma_2 + \sigma_2 \wedge \sigma_3 + \sigma_3 \wedge \sigma_1) = g_2 \wedge g_3 = g_3 \wedge g_1$.

Using this in (2.9a, b) with $e_1 = 1$, we get $e_2 = e_3 = 1$. Hence,

    $\mathbf{e}_1 = \sigma_1 + \sigma_2 + \sigma_3$    (2.12)

is the desired eigenvector.

To find an eigenvector corresponding to $\lambda = 5$, we evaluate $g_k = f_k - 5\sigma_k$
and find that

    $g_1 = g_2 = g_3 = -(\sigma_1 + \sigma_2 + \sigma_3)$.

Using this in (2.10a), we find $e_3 = -2$ when $e_1 = e_2 = 1$; hence,

    $\mathbf{e}_2 = \sigma_1 + \sigma_2 - 2\sigma_3$    (2.13a)

is an eigenvector. On the other hand, from (2.10b) we find that $e_2 = -1$ when
$e_1 = 1$ and $e_3 = 0$; hence,

    $\mathbf{e}_3 = \sigma_1 - \sigma_2$    (2.13b)

is another eigenvector. Therefore, every vector in the plane determined by
the bivector

    $\mathbf{e}_3 \wedge \mathbf{e}_2 = 2(\sigma_1 - \sigma_3) \wedge (\sigma_2 - \sigma_3)$

is an eigenvector with eigenvalue $\lambda = 5$.
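
The wedge-product form of Cramer's rule in (2.9a, b) can be sketched in components. The code below is a minimal illustration for a single (nondegenerate) root: it stores a bivector by its three components, which coincide numerically with cross-product components, and obtains the ratio of two parallel bivectors by projection. The helper names are assumptions made for this sketch.

```python
def wedge(a, b):
    """Components of a ^ b on the bivector basis (s2^s3, s3^s1, s1^s2)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ratio(num, den):
    """Scalar r such that num = r * den, for parallel bivectors."""
    return sum(n * d for n, d in zip(num, den)) / sum(d * d for d in den)


def eigenvector_single_root(columns, lam):
    """Eigenvector components (1, e2, e3) from (2.9a, b).

    columns[k] holds the vector f_k, the image of the k-th basis vector.
    Assumes lam is a single root with g_2 ^ g_3 nonzero.
    """
    g = [tuple(c[j] - (lam if j == k else 0) for j in range(3))
         for k, c in enumerate(columns)]
    den = wedge(g[1], g[2])
    e2 = ratio(wedge(g[2], g[0]), den)   # (2.9a)
    e3 = ratio(wedge(g[0], g[1]), den)   # (2.9b)
    return (1.0, e2, e3)


columns = [(4, -1, -1), (-1, 4, -1), (-1, -1, 4)]
e = eigenvector_single_root(columns, 2)
```

For the double root the denominator vanishes, which is why the text switches to (2.10a, b) in that case.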

The method we have developed for finding eigenvectors and eigenvalues is
sufficiently general to apply to any problem. However, the generality of the
method can be a drawback, because it may require more work than necessary
for special problems. For example, it often happens that an eigenvector is
known at the beginning. In this case it would be foolish to use the secular
equation to find the eigenvalue. Rather the eigenvalue should be obtained
directly from

    $\lambda = \frac{e \cdot (fe)}{e^2} = e^{-1} \cdot (fe)$.    (2.14)

Often it is easy to identify an eigenvector from symmetries in the given
information. Thus, perusing (2.11), we see that if we add the three equations
we get

    $f(\sigma_1 + \sigma_2 + \sigma_3) = 2(\sigma_1 + \sigma_2 + \sigma_3)$.

This tells us immediately that 2 is the eigenvalue corresponding to the
eigenvector $\mathbf{e}_1 = \sigma_1 + \sigma_2 + \sigma_3$, in agreement with what we found by the
general method after much labor. It may be a little more difficult to identify
the eigenvectors (2.13a) and (2.13b) by examining (2.11). But remember, any
other vectors in the $\mathbf{e}_2 \wedge \mathbf{e}_3$-plane will serve as well. Actually, as will be proved
below, all we need to do is to find a vector orthogonal to $\mathbf{e}_1$. Thus, we can
write $\mathbf{e}_2 = \sigma_1 + \sigma_2 + e_3\sigma_3$ and choose $e_3$ so that

    $\mathbf{e}_1 \cdot \mathbf{e}_2 = (\sigma_1 + \sigma_2 + \sigma_3) \cdot (\sigma_1 + \sigma_2 + e_3\sigma_3) = 2 + e_3 = 0$.

Clearly $e_3 = -2$, so $\mathbf{e}_2 = \sigma_1 + \sigma_2 - 2\sigma_3$, in agreement with (2.13a). From
(2.11) then, we find $f\mathbf{e}_2 = 5\mathbf{e}_2$, so the eigenvalue is 5. The vector

    $\mathbf{e}_1 \times \mathbf{e}_2 = -3(\sigma_1 - \sigma_2)$

is orthogonal to both the eigenvectors $\mathbf{e}_1$ and $\mathbf{e}_2$ and is, in fact, proportional to
the eigenvector (2.13b).

In the example just considered, all the roots of the secular equation are
real. To understand the significance of complex roots, consider the skewsymmetric
transformation

    $fx = i \cdot x = -x \cdot i$,

where $i = \sigma_1\sigma_2$ is a unit bivector. Operating on a standard basis we get

    $f\sigma_1 = i\sigma_1 = -\sigma_2$,    (2.15a)
    $f\sigma_2 = i\sigma_2 = \sigma_1$,    (2.15b)
    $f\sigma_3 = 0$.    (2.15c)

It is readily shown that the secular equation for this transformation is

    $\lambda(\lambda^2 + 1) = 0$.

The root $\lambda = 0$ corresponds to the eigenvector $\sigma_3$ in (2.15c). The point of
interest, however, is that the roots of $\lambda^2 + 1 = 0$ are "imaginary", and it is
natural to identify them as bivectors $\lambda = \pm i$, since they must be related to
(2.15a, b), which have the form of the eigenvalue equation (2.4), with $\sigma_1$ and $\sigma_2$ as
eigenvectors and the bivector i as eigenvalue. The effect of the "imaginary
eigenvalue" i is to rotate the eigenvectors $\sigma_1$ and $\sigma_2$ by 90°, so we conclude
that, in general, complex roots of the secular equation indicate that
rotations are involved.

Although "complex roots" of the secular equation can be interpreted in the
manner just described, we shall continue to regard only real roots as eigenvalues,
because an analysis of eigenvalues is not the best way to approach
problems in which eigenvalues are complex. Already we have developed a
general method for finding the canonical form of a skewsymmetric transformation
which is clearly superior to the "method of eigenvalues". In Section 5-3
we shall come to a similar conclusion about the best method for handling
rotations. By then it should be evident that the "method of eigenvalues" is
best reserved for symmetric operators, to which we now turn.


Symmetric Operators 


The terms principal vectors and principal values are sometimes used for the
eigenvectors and eigenvalues of a symmetric operator. The scalar multiples of
a principal vector compose a line called a principal axis of the operator. A
principal axis is thus a set of equivalent principal vectors.

The chief structural property of symmetric operators is described by the
following fundamental theorem: Every symmetric operator on $\mathcal{E}_3$ has three
orthogonal principal axes. This implies, of course, that all three roots of the
secular equation for a symmetric operator must be real. Let us accept this
much without proof and see what it implies about the principal vectors. If $e_1$
and $e_2$ are principal vectors of a symmetric operator $\mathcal{S}$, then we have

    $\mathcal{S}e_1 = \lambda_1 e_1$,
    $\mathcal{S}e_2 = \lambda_2 e_2$.    (2.16)

Dotting these equations by $e_2$ and $e_1$ respectively, we obtain

    $\lambda_1 e_1 \cdot e_2 = e_2 \cdot (\mathcal{S}e_1) = e_1 \cdot (\mathcal{S}e_2) = \lambda_2 e_1 \cdot e_2$.    (2.17)

This implies that $e_1 \cdot e_2 = 0$ if $\lambda_1 \neq \lambda_2$. Thus the principal vectors of a symmetric
operator which correspond to distinct principal values are necessarily
orthogonal. If $\lambda_1 = \lambda_2$, then (2.17) tells us nothing, but (2.16) tells us that any
linear combination of $e_1$ and $e_2$ is also a principal vector of $\mathcal{S}$, so we are free to
choose any combination that gives us a pair of orthogonal principal vectors.
The third principal axis is given immediately by $e_1 \times e_2$. Furthermore, the
principal axes are unique if all the principal values are distinct.

In mathematical terms, the theorem that a symmetric operator $\mathcal{S}$ has three
orthogonal principal axes is expressed by the equations

    $\mathcal{S}e_k = \lambda_k e_k$  for k = 1, 2, 3    (2.17)

and

    $e_j \cdot e_k = 0$  if $j \neq k$.    (2.18)

The operator $\mathcal{S}$ is uniquely determined by the "spectrum" of its eigenvalues
and eigenvectors. Indeed, the operator $\mathcal{S}$ can be written in the canonical form

    $\mathcal{S}x = \sum_{k=1}^{3} \lambda_k e_k^{-1}\, e_k \cdot x$,    (2.19)

or, more abstractly,

    $\mathcal{S} = \sum_{k=1}^{3} \lambda_k \mathcal{P}_k$,    (2.20a)

where

    $\mathcal{P}_k x = e_k^{-1}\, e_k \cdot x$    (2.20b)

is the projection of x onto the kth principal axis. The canonical form (2.19) or
(2.20a) is sometimes called the spectral decomposition or spectral form of a
symmetric operator, by analogy with the decomposition of light into a
spectrum of colors. Note that the eigenvalue equations (2.17) follow trivially
from the spectral form (2.19), so (2.19) can be regarded as the result of
solving (2.17) for the operator $\mathcal{S}$ in terms of the $\lambda_k$ and the $e_k$.
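
The spectral form can be illustrated by assembling an operator from its principal values and principal vectors. The sketch below works with matrix components in an orthonormal basis, where the projection onto a principal axis becomes the outer product of the principal vector with itself divided by its squared length. The eigenpairs used are 2 with (1, 1, 1) and 5 with the orthogonal pair (1, 1, -2), (1, -1, 0); function names are illustrative.

```python
def projector(e):
    """Matrix of the projection onto the axis of e: e e^T / (e . e)."""
    n2 = sum(c * c for c in e)
    return [[e[j] * e[k] / n2 for k in range(3)] for j in range(3)]


def spectral_operator(pairs):
    """Sum of lambda_k P_k over (lambda_k, e_k) pairs with orthogonal e_k."""
    s = [[0.0] * 3 for _ in range(3)]
    for lam, e in pairs:
        p = projector(e)
        for j in range(3):
            for k in range(3):
                s[j][k] += lam * p[j][k]
    return s


pairs = [(2.0, (1, 1, 1)), (5.0, (1, 1, -2)), (5.0, (1, -1, 0))]
S = spectral_operator(pairs)
```

Up to rounding, `S` is the matrix with 4 on the diagonal and -1 elsewhere, so the operator is recovered from its spectrum.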

From the spectral form (2.20a) for a nonsingular symmetric operator $\mathcal{S}$, the
inverse operator is given immediately by

    $\mathcal{S}^{-1} = \sum_k \frac{1}{\lambda_k} \mathcal{P}_k$.    (2.21)

To verify this by showing that $\mathcal{S}^{-1}\mathcal{S} = 1$, one needs the following basic
properties of projection operators:

(a) orthogonality

    $\mathcal{P}_j \mathcal{P}_k = 0$  if $j \neq k$,    (2.22a)

(b) idempotence

    $\mathcal{P}_k^2 = \mathcal{P}_k$,    (2.22b)

(c) completeness

    $\mathcal{P}_1 + \mathcal{P}_2 + \mathcal{P}_3 = 1$.    (2.22c)

If the principal values $\lambda_k$ are positive, then $\mathcal{S}$ has a unique positive square root

    $\mathcal{S}^{1/2} = \sum_k \lambda_k^{1/2} \mathcal{P}_k$.    (2.23)

This is a square root operator in the sense that $(\mathcal{S}^{1/2})^2 = \mathcal{S}^{1/2}\mathcal{S}^{1/2} = \mathcal{S}$, as is
readily verified. An operator f is said to be positive if $x \cdot (fx) > 0$ for every
nonzero vector x. This implies that any eigenvalues of f are positive. So with
this nomenclature, we can assert that every positive symmetric operator has a
unique square root which is also a positive symmetric operator.
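
The inverse and square root formulas amount to applying a scalar function to each principal value while keeping the projections fixed. The following sketch checks this in matrix components for an arbitrarily chosen positive symmetric operator; the eigenpairs and helper names are assumptions made for illustration.

```python
import math


def projector(e):
    """Matrix of the projection onto the axis of e: e e^T / (e . e)."""
    n2 = sum(c * c for c in e)
    return [[e[j] * e[k] / n2 for k in range(3)] for j in range(3)]


def spectral_function(pairs, func):
    """Sum of func(lambda_k) P_k over orthogonal eigenpairs (lambda_k, e_k)."""
    s = [[0.0] * 3 for _ in range(3)]
    for lam, e in pairs:
        p = projector(e)
        for j in range(3):
            for k in range(3):
                s[j][k] += func(lam) * p[j][k]
    return s


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[j][m] * b[m][k] for m in range(3)) for k in range(3)]
            for j in range(3)]


# Sample positive spectrum with mutually orthogonal principal vectors.
pairs = [(4.0, (1, 1, 0)), (9.0, (1, -1, 0)), (1.0, (0, 0, 1))]
S = spectral_function(pairs, lambda lam: lam)
S_inv = spectral_function(pairs, lambda lam: 1.0 / lam)   # as in (2.21)
S_half = spectral_function(pairs, math.sqrt)              # as in (2.23)
```

Multiplying `S_inv` by `S` gives the identity and squaring `S_half` gives `S`, which is what the projector properties (2.22a, b, c) guarantee.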

A positive symmetric operator $\mathcal{S}$ can be given a geometric interpretation by
considering its effect on vectors in a principal plane (i.e. a plane determined
by two principal vectors). As Figure 2.1 shows, $\mathcal{S}$ transforms (i.e. stretches)
the points on a square into points on a parallelogram. Similarly, $\mathcal{S}$
transforms circles into ellipses. In particular, $\mathcal{S}$ stretches the unit circle
into an ellipse for which the lengths of the semi-axes are the principal
values of $\mathcal{S}$.

Fig. 2.1. Symmetric transformation with principal values $\lambda_1 = 3/2$, $\lambda_2 = 2/3$.

A positive symmetric operator $\mathcal{S}$ on $\mathcal{E}_3$ transforms the unit sphere into an
ellipsoid, as specified by

    $x = \mathcal{S}u$,    (2.24)

where u is any unit vector. This is a parametric equation for the ellipsoid with
parameter vector u. We obtain a nonparametric equation for the ellipsoid by
eliminating u as follows:

    $(\mathcal{S}^{-1}x)^2 = u^2 = 1$.

Since $\mathcal{S}^{-1}$ is a symmetric operator, this equation can be put in the form

    $x \cdot (\mathcal{S}^{-2}x) = 1$,    (2.25)

where $\mathcal{S}^{-2} = \mathcal{S}^{-1}\mathcal{S}^{-1}$. Using the spectral decomposition of $\mathcal{S}^{-1}$ (see Equation
(2.21)), we can write (2.25) in the form

    $\frac{x_1^2}{\lambda_1^2} + \frac{x_2^2}{\lambda_2^2} + \frac{x_3^2}{\lambda_3^2} = 1$,    (2.26)

where $x_k = x \cdot \hat{e}_k$. Equation (2.26) will be recognized as the standard "coordinate
form" for an ellipsoid with "semi-axes" $\lambda_1$, $\lambda_2$, $\lambda_3$ (Figure 2.2).

We have now found a canonical form for arbitrary symmetric operators and
supplied it with a geometrical interpretation. In some problems the
eigenvectors and eigenvalues are given in the initial information so an
appropriate operator can be constructed directly from its spectral
form. We shall encounter variants of the canonical form which are more
convenient in certain applications, but all variants must, of course, be
constructed from the eigenvectors and eigenvalues.

Fig. 2.2. An ellipsoid with semi-axes $\lambda_1$, $\lambda_2$, $\lambda_3$.


The Eigenvector Problem in 2 Dimensions 


We have seen that the “secular method” for solving the eigenvector problem 
can be quite laborious. For operators acting on a 2-dimensional vector space 
there is an easier method, which we now derive. 

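For a positive symmetric operator $\mathcal{S}$ on a plane $\mathcal{E}_2$, the eigenvector equations
can be written

    $\mathcal{S}\mathbf{e}_\pm = \lambda_\pm \mathbf{e}_\pm$,    (2.27)

where $\mathbf{e}_+$ and $\mathbf{e}_-$ are the principal vectors corresponding to the principal
values $\lambda_+$ and $\lambda_-$ respectively.

We assume that $\mathcal{S}$ is known, so its action $\mathcal{S}u$ on any specified unit vector u
in the plane can be determined. Now write $\mathbf{e} = \hat{\mathbf{e}}_+$ and decompose u into a
component $u_\parallel$ collinear with $\mathbf{e}$ and a component $u_\perp$ orthogonal to $\mathbf{e}$. Then
we can write

    $\mathcal{S}u = \mathcal{S}(u_\parallel + u_\perp) = \lambda_+ u_\parallel + \lambda_- u_\perp$
    $\quad = \lambda_+ \mathbf{e}\,\mathbf{e} \cdot u + \lambda_- \mathbf{e}\,\mathbf{e} \wedge u = \tfrac{1}{2}\lambda_+(u + \mathbf{e}u\mathbf{e}) + \tfrac{1}{2}\lambda_-(u - \mathbf{e}u\mathbf{e})$.

Therefore,

    $\mathcal{S}u = \tfrac{1}{2}(\lambda_+ + \lambda_-)u + \tfrac{1}{2}(\lambda_+ - \lambda_-)\mathbf{e}u\mathbf{e}$.    (2.28)

The angle $\phi$ between the unit vectors $\mathbf{e}$ and u is given by the equation

    $u\mathbf{e} = e^{i\phi}$  or  $\mathbf{e} = ue^{i\phi}$,    (2.29)

where i is the unit bivector for the plane. Therefore, Equation (2.28) involves
three unknowns $\lambda_+$, $\lambda_-$ and $\phi$, so we need another equation before we can
solve for them. This is most easily obtained by operating on the vector
$ui = -iu$ which is orthogonal to u. Thus, from (2.28) we obtain

    $i\mathcal{S}(ui) = \tfrac{1}{2}(\lambda_+ + \lambda_-)u - \tfrac{1}{2}(\lambda_+ - \lambda_-)\mathbf{e}u\mathbf{e}$.    (2.30)

Combining (2.28) and (2.30), we get

    $u_+ \equiv \mathcal{S}u + i\mathcal{S}(ui) = (\lambda_+ + \lambda_-)u$,    (2.31a)

    $u_- \equiv \mathcal{S}u - i\mathcal{S}(ui) = (\lambda_+ - \lambda_-)\mathbf{e}u\mathbf{e}$.    (2.31b)

Without loss of generality we may assume $\lambda_+ \geq \lambda_-$, so (2.31a, b) shows that
the principal values are determined by the magnitudes $|u_\pm| = \lambda_+ \pm \lambda_-$ of the
known vectors $u_+$ and $u_-$. In addition, we obtain the unit vector equation
$\hat{u}_- = \mathbf{e}u\mathbf{e}$ from (2.31b). When reexpressed in the form $\mathbf{e}\hat{u}_- = u\mathbf{e} = e^{i\phi}$, this
tells us that the direction $\mathbf{e}$ is half way between the directions $\hat{u}_-$ and $u = \hat{u}_+$.
Therefore

    $\mathbf{e}_+ = \alpha(\hat{u}_+ + \hat{u}_-)$    (2.32)

is an eigenvector of $\mathcal{S}$ for any nonzero scalar $\alpha$. If $u_+ \wedge u_- \neq 0$, then
$\mathbf{e}_- = \alpha(\hat{u}_+ - \hat{u}_-)$ is the other eigenvector we want since $\mathbf{e}_+ \cdot \mathbf{e}_- = 0$.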

Our results are summarized by Mohr's algorithm: To solve the eigenvector
problem for a positive symmetric operator $\mathcal{S}$ on a plane with direction i,
choose any convenient unit vector u in the plane and compute the two vectors

    $u_\pm = \mathcal{S}u \pm i\mathcal{S}(ui)$.    (2.33a)

Then, for $u_+ \wedge u_- \neq 0$, the vectors

    $\mathbf{e}_\pm = \alpha(\hat{u}_+ \pm \hat{u}_-)$    (2.33b)

are principal vectors of $\mathcal{S}$ with corresponding principal values

    $\lambda_\pm = \tfrac{1}{2}(|u_+| \pm |u_-|)$.    (2.33c)

(See Figure 2.3.) If u happens to be collinear with one of the principal vectors,
then $u_+ \wedge u_- = 0$ and (2.33b) yields that vector only. Of course, the other
vector is orthogonal to it.
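
Mohr's algorithm translates directly into a few lines of component arithmetic. The sketch below assumes the plane is spanned by an orthonormal basis in which the unit bivector acts on a vector from the right as a counterclockwise quarter turn and from the left as a clockwise quarter turn; it also assumes the chosen unit vector u is not collinear with a principal vector, so that neither output vector vanishes. Function names and the sample matrix are illustrative.

```python
import math


def apply(S, v):
    """Apply a 2x2 matrix (list of rows) to a vector."""
    return (S[0][0] * v[0] + S[0][1] * v[1],
            S[1][0] * v[0] + S[1][1] * v[1])


def times_i_right(v):
    """v i: counterclockwise quarter turn (sigma_1 i = sigma_2)."""
    return (-v[1], v[0])


def times_i_left(v):
    """i v: clockwise quarter turn (i sigma_1 = -sigma_2)."""
    return (v[1], -v[0])


def mohr(S, u=(1.0, 0.0)):
    """Return ((lam_plus, e_plus), (lam_minus, e_minus)) via (2.33a, b, c)."""
    Su = apply(S, u)
    w = times_i_left(apply(S, times_i_right(u)))      # i S(u i)
    u_plus = (Su[0] + w[0], Su[1] + w[1])
    u_minus = (Su[0] - w[0], Su[1] - w[1])
    n_plus, n_minus = math.hypot(*u_plus), math.hypot(*u_minus)
    if n_minus == 0.0:
        raise ValueError("equal principal values: every vector is principal")
    up = (u_plus[0] / n_plus, u_plus[1] / n_plus)
    um = (u_minus[0] / n_minus, u_minus[1] / n_minus)
    e_plus = (up[0] + um[0], up[1] + um[1])
    e_minus = (up[0] - um[0], up[1] - um[1])
    return ((0.5 * (n_plus + n_minus), e_plus),
            (0.5 * (n_plus - n_minus), e_minus))


S = [[2.0, 1.0],
     [1.0, 2.0]]
(lam_plus, e_plus), (lam_minus, e_minus) = mohr(S)
```

For this sample matrix the principal values come out as 3 and 1, with principal vectors along (1, 1) and (1, -1).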
Fig. 2.3. Parameters in Mohr's Algorithm.

The principal vectors can be found in an alternative manner.
Multiplying (2.31a) with (2.31b) and using (2.29), we obtain

    $u_+ u_- = (\lambda_+^2 - \lambda_-^2)\, e^{2i\phi}$.    (2.34)

Whence the angle $\phi$ is determined by

    $\tan 2\phi = \frac{u_+ \wedge u_-}{i\, u_+ \cdot u_-}$.    (2.35)

Then $\hat{\mathbf{e}}_+ = \mathbf{e}$ is determined by (2.29).

The name for Mohr's algorithm is taken from Mohr's circle (Figure
2.4), which is used in engineering textbooks to solve the eigenvalue
problem by graphical means. A parametric equation $Z = Z(\phi)$ for
Mohr's circle can be obtained directly from (2.28) and (2.29); thus

    $Z(u) \equiv u\mathcal{S}u = \tfrac{1}{2}(\lambda_+ + \lambda_-) + \tfrac{1}{2}(\lambda_+ - \lambda_-)(u\mathbf{e})^2 = Z(\phi)$
    $\quad = \tfrac{1}{2}(\lambda_+ + \lambda_-) + \tfrac{1}{2}(\lambda_+ - \lambda_-)\, e^{2i\phi}$.    (2.36)

Fig. 2.4. Mohr's Circle.

To solve for the unknowns, Z must be known for two values of $\phi$. The choice
corresponding to (2.30) is

    $Z(\phi + \tfrac{\pi}{2}) = \tfrac{1}{2}(\lambda_+ + \lambda_-) - \tfrac{1}{2}(\lambda_+ - \lambda_-)\, e^{2i\phi}$.    (2.37)

The solution of these two equations is of course equivalent to Mohr's
algorithm. But the formulation of Mohr's algorithm by (2.33a, b, c) has the
advantage of involving vectors only.


Example 


To demonstrate the effectiveness of Mohr's algorithm, let us solve the
eigenvalue problem for the tensor

    $\mathcal{S}u = a\, a \wedge u + b\, b \wedge u$.    (2.38)

As will be seen in Chapter 7, this is a general form for the inertia tensor of a
plane lamina. Now

    $\mathcal{S}a = b\, b \wedge a = b^2 a - a \cdot b\, b$

and, for $a \wedge b = |a \wedge b|\, i$,

    $i\mathcal{S}(ai) = i[a\, a \wedge (ai) + b\, b \wedge (ai)] = a^2 a + a \cdot b\, b$.

Hence, for $u = \hat{a}$ in (2.33a),

    $a_+ \equiv |a|\, u_+ = \mathcal{S}a + i\mathcal{S}(ai) = (a^2 + b^2)\, a$,

    $a_- \equiv |a|\, u_- = \mathcal{S}a - i\mathcal{S}(ai) = (b^2 - a^2)\, a - 2 a \cdot b\, b$,

and

    $|a_-| = |a|\, [(b^2 - a^2)^2 + 4(a \cdot b)^2]^{1/2}$.

Therefore, by (2.33c) the principal values are

    $\lambda_\pm = \tfrac{1}{2}\{a^2 + b^2 \pm [(b^2 - a^2)^2 + 4(a \cdot b)^2]^{1/2}\}$.    (2.39)

By (2.33b) the corresponding principal vectors are

    $\mathbf{e}_\pm = a \pm \frac{(b^2 - a^2)\, a - 2 a \cdot b\, b}{[(b^2 - a^2)^2 + 4(a \cdot b)^2]^{1/2}}$.    (2.40)

It will be noted in this example how the free choice of u in Mohr's algorithm
enabled us to simplify computations by taking the special structure of $\mathcal{S}$ into
account.

Unfortunately, there is no known generalization of Mohr's algorithm to
solve the eigenvector problem in 3 dimensions. However, whenever one
eigenvector is already known, Mohr's algorithm can be applied to the plane
orthogonal to it. For example, any tensor constructed from two vectors a and
b necessarily has $a \times b$ as an eigenvector. Thus, for (2.38) we find

    $\mathcal{S}(a \times b) = (a^2 + b^2)\, a \times b$.
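
The closed form (2.39) can be compared with a direct eigenvalue computation. The sketch below restricts the tensor to the plane of a and b and uses the identity a a∧u = a²u − a a·u, valid for u in that plane, to write its matrix as (a² + b²) times the identity minus the outer products of a and b with themselves; this matrix form and the helper names are assumptions of the sketch, not expressions taken from the text.

```python
import math


def dot(a, b):
    """Euclidean inner product in the plane."""
    return a[0] * b[0] + a[1] * b[1]


def lamina_tensor(a, b):
    """2x2 matrix of S u = a a^u + b b^u for u in the plane of a and b."""
    s = dot(a, a) + dot(b, b)
    return [[(s if j == k else 0.0) - a[j] * a[k] - b[j] * b[k]
             for k in range(2)] for j in range(2)]


def principal_values_closed_form(a, b):
    """Principal values from the closed form (2.39)."""
    a2, b2, ab = dot(a, a), dot(b, b), dot(a, b)
    root = math.sqrt((b2 - a2) ** 2 + 4.0 * ab ** 2)
    return 0.5 * (a2 + b2 + root), 0.5 * (a2 + b2 - root)


def eigenvalues_2x2(m):
    """Eigenvalues of a symmetric 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(tr * tr - 4.0 * det)
    return 0.5 * (tr + root), 0.5 * (tr - root)


a, b = (1.0, 0.0), (1.0, 1.0)
closed = principal_values_closed_form(a, b)
direct = eigenvalues_2x2(lamina_tensor(a, b))
```

For these sample vectors both computations give (3 ± √5)/2.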


5-2. Exercises 


(2.1) Find the adjoint as well as the symmetric and skewsymmetric parts
of the linear transformation

    $fx = \alpha x + a\, b \cdot x + x \cdot A$.

(2.2) Derive Equations (2.7a, b, c, d) from Equation (2.6).
(2.3) Find the eigenvectors and eigenvalues for operators with the following
matrices
(a) 05 (b) i V6 -V3- 
0 -2 0 Von 2 -5V2 
a Od -V3 -5V2 -3 


(2.4) We write $\mathcal{S}^n$ for the n-fold product of an operator $\mathcal{S}$ with itself.
Prove that if $\mathcal{S}$ is symmetric with eigenvalues $\lambda_k$, then $\mathcal{S}^n$ is symmetric
with eigenvalues $(\lambda_k)^n$ and the same eigenvectors as $\mathcal{S}$.

(2.5) A linear operator $\mathcal{S}$ is given by

    $\mathcal{S}\sigma_1 = 7\sigma_1 + 2\sigma_2$,
    $\mathcal{S}\sigma_2 = 2\sigma_1 + 6\sigma_2 - 2\sigma_3$,
    $\mathcal{S}\sigma_3 = -2\sigma_2 + 5\sigma_3$.

Determine its eigenvalues and eigenvectors.

(2.6) Describe the eigenvalue spectrum of a symmetric operator $\mathcal{S}$ so that
the equation

    $x \cdot (\mathcal{S}x) = 1$

is equivalent to the standard coordinate forms for each of the
following quadratic surfaces:
(a) Ellipsoid:

    $\frac{x_1^2}{a^2} + \frac{x_2^2}{b^2} + \frac{x_3^2}{c^2} = 1$.

(b) Hyperboloid of one sheet:

    $\frac{x_1^2}{a^2} + \frac{x_2^2}{b^2} - \frac{x_3^2}{c^2} = 1$.

(c) Hyperboloid of two sheets:

    $\frac{x_1^2}{a^2} - \frac{x_2^2}{b^2} - \frac{x_3^2}{c^2} = 1$.

(2.7) Describe the solution set {x} of the equation

    $[f(x - a)]^2 = 1$,

where f is any linear operator.

(2.8) If a, b, c are mutually perpendicular and $\mathcal{S}$ is a symmetric tensor,
prove that the three vectors $a \times \mathcal{S}a$, $b \times \mathcal{S}b$, $c \times \mathcal{S}c$ are coplanar.

(2.9) Prove the basic properties of projection operators formulated by
Equations (2.22a, b, c), and verify that the inverse of a nonsingular
symmetric operator is given by (2.21).

(2.10) Find the eigenvalues and eigenvectors of the tensors

(a) $\mathcal{S}u = a\, a \cdot u + b\, b \cdot u$.

(b) $\mathcal{T}u = a\, b \cdot u + b\, a \cdot u$.

(2.11) For an operator f specified by the symmetric matrix

    $\begin{pmatrix} f_{11} & f_{12} \\ f_{12} & f_{22} \end{pmatrix}$

with respect to an orthonormal basis $\sigma_1$, $\sigma_2$, show that

    $\lambda_\pm = \tfrac{1}{2}(f_{11} + f_{22}) \pm \tfrac{1}{2}[(f_{11} - f_{22})^2 + 4f_{12}^2]^{1/2}$

are eigenvalues, and the angle $\phi$ from $\sigma_1$ to the eigenvector $\mathbf{e}_+$ is
given by

    $\tan 2\phi = \frac{2f_{12}}{f_{11} - f_{22}}$.

(2.12) Solve the eigenvector problem for the tensor

    $\mathcal{S}u = a\, a \wedge u + b\, b \wedge u + c\, c \wedge u$

where a + b + c = 0.


5-3. The Arithmetic of Reflections and Rotations


A linear transformation can be classified according to specific relations among
vectors which it leaves unchanged. Such relations are said to be invariants of
the transformation. Transformations for which the inner product is an invariant
are called orthogonal transformations. Thus, an orthogonal transformation
f on $\mathcal{E}_3$ has the property

    $(fx) \cdot (fy) = x \cdot y$    (3.1)

for all vectors x and y in $\mathcal{E}_3$. On the other hand, the property (1.9) of the
adjoint implies that

    $(fx) \cdot (fy) = x \cdot (\bar{f}fy)$.

This is equivalent to (3.1) if and only if

    $\bar{f} = f^{-1}$.    (3.2)

Therefore, an orthogonal operator is a nonsingular operator for which the
inverse is equal to the adjoint.

From (3.1) it follows that

    $(fx)^2 = x^2$.    (3.3)

Thus, the magnitude of every vector in $\mathcal{E}_3$ is invariant under an orthogonal
transformation. The orthogonality of vectors in a standard basis is another
invariant. Specifically for $f_k = f\sigma_k$, (3.1) implies

    $f_j \cdot f_k = \sigma_j \cdot \sigma_k = \delta_{jk}$.    (3.4)

This can be used to prove that the magnitude of the unit pseudoscalar is
invariant under orthogonal transformations. Since

    $f(i) = f(\sigma_1\sigma_2\sigma_3) = f_1 f_2 f_3$,

we have

    $|f(i)|^2 = (f_3 f_2 f_1)(f_1 f_2 f_3) = 1$.

But $f(i) = (\det f)\, i$. Hence,

    $|f(i)|^2 = (\det f)^2 = 1$,

and

    $\det f = \pm 1$.    (3.5)


This condition distinguishes two kinds of orthogonal transformations. An 
orthogonal transformation f is said to be proper if det f = 1, and improper if 
det f = — 1. Proper orthogonal transformations are usually called rotations. 

Our problem now is to find the canonical form for orthogonal transform- 
ations, that is, the simplest expression for an arbitrary orthogonal transform- 
ation in terms of geometric algebra. The general solution can be constructed 
from the simplest examples. 


Reflections 


The simplest nonsingular linear transformation that can be built out of a
single nonzero vector u is

    $\mathcal{U}x = -u^{-1}xu = -uxu^{-1} = -\hat{u}x\hat{u}$.    (3.6)

The magnitude of u does not actually play a role in any of these equivalent
expressions, so we might as well suppose that u is a unit vector and write

    $\mathcal{U}x = -uxu$.    (3.7)

The effect of this transformation is made clear by decomposing x into a
component $x_\parallel$ collinear with u plus a component $x_\perp$ orthogonal to u:

    $x = x_\parallel + x_\perp$,

where

    $x_\parallel = x \cdot u\, u$

and

    $x_\perp = x \wedge u\, u$.

Now, u commutes with $x_\parallel$ and anticommutes with $x_\perp$. Hence,

    $uxu = u(x_\parallel + x_\perp)u = x_\parallel - x_\perp$,

and (3.7) yields

    $x' = \mathcal{U}x = -uxu = x_\perp - x_\parallel$.    (3.8)

Thus $\mathcal{U}$ transforms each vector x into a vector x' by reversing the sign of the
component of x along u. In $\mathcal{E}_3$ the vector x' is the "mirror image" of x in the
plane (through the origin) with normal u, as shown in Figure 3.1. Accordingly,
the transformation (3.8) is called the reflection along u.

Fig. 3.1. Reflection of x along u.

Equation (3.8) obviously describes a linear transformation; in fact, a
reflection is an improper orthogonal transformation as we shall now show.
Consider the product of two transformed vectors:

    $(\mathcal{U}x)(\mathcal{U}y) = (-uxu)(-uyu) = uxyu$.

The scalar part of this equation gives us

    $(\mathcal{U}x) \cdot (\mathcal{U}y) = u\, x \cdot y\, u = x \cdot y$,

which proves that $\mathcal{U}$ is orthogonal. The bivector part of the equation
gives us the outermorphism

    $\mathcal{U}(x \wedge y) = (\mathcal{U}x) \wedge (\mathcal{U}y) = u(x \wedge y)u$.    (3.9)

Now, for transformations on $\mathcal{E}_3$,
the determinant is obtained from the outermorphism of trivectors, which are
the pseudoscalars. Thus, from the product of three transformed vectors

    $(\mathcal{U}x)(\mathcal{U}y)(\mathcal{U}z) = (-uxu)(-uyu)(-uzu) = -u(xyz)u$,

we take the trivector part to get

    $\mathcal{U}(x \wedge y \wedge z) = -u(x \wedge y \wedge z)u = -x \wedge y \wedge z$.    (3.10)

The last step in (3.10) follows from the fact that all pseudoscalars commute
with the vectors in $\mathcal{E}_3$. From (3.10) it follows that

    $\det \mathcal{U} = -1$.    (3.11)

Hence, the transformation is improper as claimed.
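
In vector components the reflection along a unit vector u reverses the component of x along u, which can be written as x minus twice that component. The following sketch uses this component form (an equivalent restatement, not the geometric product itself) to check that inner products are preserved and that the determinant is -1; the sample vectors and function names are illustrative.

```python
def dot(a, b):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(a, b))


def reflect(x, u):
    """Reflection of x along the unit vector u: x - 2 (x . u) u."""
    s = 2.0 * dot(x, u)
    return tuple(xk - s * uk for xk, uk in zip(x, u))


def det3(m):
    """Determinant of a 3x3 array given as three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


u = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)             # sample unit vector
basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
images = [reflect(s, u) for s in basis]           # images of the basis vectors
determinant = det3(images)

x, y = (1.0, -2.0, 0.5), (3.0, 1.0, 4.0)
inner_before = dot(x, y)
inner_after = dot(reflect(x, u), reflect(y, u))
```

Reflecting u itself returns -u, while any vector orthogonal to u is left unchanged.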


Example 


Reflections occur frequently in physics. For example, for a particle rebounding
elastically from a fixed plane with normal u, the final momentum p' is
related to the initial momentum p by

    $p' = -upu$.    (3.12)

This, of course, has the same form as (3.8), but Figure 3.2, which we
associate with it, is a little different from the Figure 3.1 associated with (3.8).

Fig. 3.2. Elastic reflection from a plane.

Now, Equation (3.12) implies that |p'| = |p|, as required by kinetic energy
conservation. Consequently, we can put (3.12) in the form

    $u\hat{p}' = -\hat{p}u$.    (3.13)

We know from Section 2-4 that the product of unit vectors can be expressed
as the exponential of the angle between them. So (3.13) can be written

    $e^{i\theta'} = e^{i\theta}$,    (3.14)

where i is the unit bivector for the plane of reflection and the angles $\theta$ and $\theta'$
are as indicated in Figure 3.2.

From (3.14) it follows that the angle of reflection $\theta'$ equals the angle
of incidence $\theta$. This is the elementary statement of the "law of reflection".
But a full description of a reflection must specify the plane of reflection as
well as the angles. All this is expressed by (3.12), which can be regarded as a
complete statement of the law of reflection. Equation (3.12) provides an approximate
description of the rebound of a ball from a wall and quite an accurate description
for the change in direction of light reflected from a plane surface. It provides an
especially efficient means of computing the net effect of several successive
reflections, as will be obvious after we have examined the composition of
reflections.


Rotations 

Now let us consider the product of the reflection (3.7) with another reflection

    $\mathcal{V}x = -vxv$.

We have

    $\mathcal{V}\mathcal{U}x = vuxuv$.

This determines a new linear transformation $\mathcal{R} = \mathcal{V}\mathcal{U}$ of the form

    $\mathcal{R}x = R^\dagger x R$,    (3.15)

where R can be written in the form

    $R = uv = u \cdot v + u \wedge v = e^{\frac{1}{2}A}$,    (3.16)

    $R^\dagger = vu = u \cdot v - u \wedge v = e^{-\frac{1}{2}A}$.    (3.17)

The reason for writing $\tfrac{1}{2}A$ for the bivector angle between vectors u and v will
be made clear below. According to Section 2-3, R is a spinor, or quaternion.
Since

    $R^\dagger R = 1$,    (3.18)

it has unit magnitude |R| = 1, and it is said to be a unitary or unimodular
spinor. Note from (3.16) that the bivector part of R can be written

    $\langle R \rangle_2 = u \wedge v = \langle e^{\frac{1}{2}A} \rangle_2 = \sinh \tfrac{1}{2}A = \hat{A} \sin \tfrac{1}{2}|A|$,

so it has the direction

    $\hat{A} = \frac{u \wedge v}{|u \wedge v|}$.    (3.19)

We are using here important properties of exponential and trigonometric
functions established in Section 2-5. We shall now show that this bivector
specifies a plane of rotation.

To determine the effect of $\mathcal{R}$ on a vector x, we decompose x into a
component $x_\parallel$ in the A-plane and a component $x_\perp$ orthogonal to the A-plane:

    $x = x_\parallel + x_\perp$,

where, for $A \neq 0$,

    $x_\parallel = x \cdot A\, A^{-1}$

and

    $x_\perp = x \wedge A\, A^{-1}$.

It follows that A anticommutes with $x_\parallel$ and commutes with $x_\perp$, that is,

    $x_\parallel A = x_\parallel \cdot A = -A \cdot x_\parallel = -A x_\parallel$,

and

    $x_\perp A = x_\perp \wedge A = A \wedge x_\perp = A x_\perp$.

Using these relations, we find from (3.16) and (3.17) that

    $R^\dagger x_\parallel = x_\parallel R$

and

    $R^\dagger x_\perp = x_\perp R^\dagger$.

Therefore,

    $R^\dagger x R = R^\dagger (x_\parallel + x_\perp) R = x_\parallel R^2 + x_\perp R^\dagger R$.

Hence, from (3.15) we obtain

    $x' = \mathcal{R}x = e^{-\frac{1}{2}A}\, x\, e^{\frac{1}{2}A} = x_\perp + x_\parallel e^{A}$.    (3.20)

We have already seen in Section 2-3 that an expression of the form $x_\parallel e^{A}$
describes a rotation of $x_\parallel$ in the A-plane through an angle of magnitude |A|.
Therefore, the transformation (3.20) can be depicted as in Figure 3.3.
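
The composition of two reflections can be checked in components. The sketch below reflects first along u and then along v, using the component form x − 2(x·u)u for each reflection, with unit vectors u and v chosen 45° apart. Since the angle between u and v is half the rotation angle, the composite should be a quarter turn in their plane; the sample vectors and names are illustrative.

```python
import math


def dot(a, b):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(a, b))


def reflect(x, u):
    """Reflection of x along the unit vector u: x - 2 (x . u) u."""
    s = 2.0 * dot(x, u)
    return tuple(xk - s * uk for xk, uk in zip(x, u))


def rotate_by_reflections(x, u, v):
    """Reflect along u, then along v; the composite is a rotation."""
    return reflect(reflect(x, u), v)


u = (1.0, 0.0, 0.0)
v = (math.cos(math.pi / 4), math.sin(math.pi / 4), 0.0)   # 45 degrees from u

image_1 = rotate_by_reflections((1.0, 0.0, 0.0), u, v)    # expect about (0, 1, 0)
image_2 = rotate_by_reflections((0.0, 1.0, 0.0), u, v)    # expect about (-1, 0, 0)
image_3 = rotate_by_reflections((0.0, 0.0, 1.0), u, v)    # axis direction, unchanged
```

The vector orthogonal to the plane of u and v is left fixed, in agreement with the decomposition in (3.20).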

It is easy to prove that det 7 = | in the same way that we proved det “/ = 
—1. Therefore % is a rotation. More specifically, we say that (3.20) de- 
scribes a rotation by (or though) an angle A. In “, we can write 


A = ia, (3.21) 
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expressing A as the dual of a vector a. Then the direction of a specifies the axis 
of rotation, as shown in Figure 3.3, and the magnitude | a | = | A | gives the 
scalar angle of rotation. It is important to note that the vector a has been 
defined so that the following 
right hand rule for rotation 
applies: If the thumb of the 
right hand points along the 
direction of a, then the ro- 
tation in the plane “follows” 
the fingers, as indicated in 
Figure 3.3. 

The equation % = RtxR 
where Risa unitary spinor is, Fig. 3.3. Rotation by an angle A = ia. 
in fact, the desired canonical 
form for any rotation on ¢’,. We can always express R in the form (3.16), as is 
proved below. For a rotation %x = R'xR, it is usually most convenient to 
express the spinor R in one of the two parametric forms, 


R=a+ ig (3.22a) 
Ph suales (3.22b) 


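rather than in the form (3.16). These forms exploit the fact that in $\mathcal{G}_3$ every
bivector can be expressed as the dual of a vector. Since

$$e^{(1/2)i\mathbf{a}} = \cos\tfrac{1}{2}\mathbf{a} + i\sin\tfrac{1}{2}\mathbf{a},$$
the parameters $\alpha$, $\boldsymbol{\beta}$ and $\mathbf{a}$ in (3.22a, b) are therefore related by

$$\alpha = \cos\tfrac{1}{2}\mathbf{a} = \cos\tfrac{1}{2}|\mathbf{a}|, \tag{3.23a}$$
$$\boldsymbol{\beta} = \sin\tfrac{1}{2}\mathbf{a} = \hat{\mathbf{a}}\sin\tfrac{1}{2}|\mathbf{a}|. \tag{3.23b}$$
The parameter $\alpha$ is, of course, not independent of $\boldsymbol{\beta}$, because

$$R^\dagger R = (\alpha - i\boldsymbol{\beta})(\alpha + i\boldsymbol{\beta}) = \alpha^2 + \boldsymbol{\beta}^2 = 1.$$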


This will be recognized as a familiar trigonometric identity if expressed in
terms of the angle by (3.23a, b). Since the form (3.22b) expresses the spinor $R$
and therefore the rotation $\mathcal{R}$ as a function of the angle and axis of rotation
represented by a vector $\mathbf{a}$, it is appropriate to refer to it as the angular form or
the angular parametrization of the rotation. The four parameters $\alpha$, $\beta_k = \boldsymbol{\sigma}_k\cdot\boldsymbol{\beta}$
for $k = 1, 2, 3$ are called Euler parameters in the literature, so let us refer to $\alpha$
and $\boldsymbol{\beta}$ respectively as the Euler scalar and Euler vector of the rotation. Other
parametrizations which are useful for various special purposes are given in the
exercises.
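As a minimal numerical sketch of (3.23a, b), the following Python fragment converts an angle vector into the Euler scalar and Euler vector and checks the normalization $\alpha^2 + \boldsymbol{\beta}^2 = 1$. The function name `euler_parameters` and the representation of vectors as 3-tuples of components on the standard basis are illustrative assumptions, not notation from the text.

```python
import math

def euler_parameters(a):
    """Return (alpha, beta) for the angle vector a, following (3.23a, b)."""
    angle = math.sqrt(sum(c * c for c in a))      # |a|
    alpha = math.cos(0.5 * angle)                 # Euler scalar
    if angle == 0.0:
        return alpha, (0.0, 0.0, 0.0)             # no rotation: beta vanishes
    s = math.sin(0.5 * angle) / angle             # sin(|a|/2) / |a|
    beta = tuple(s * c for c in a)                # a_hat * sin(|a|/2)
    return alpha, beta

# Rotation by 90 degrees about the third basis vector.
alpha, beta = euler_parameters((0.0, 0.0, math.pi / 2))
norm = alpha**2 + sum(b * b for b in beta)        # should equal 1 up to rounding
```

Only three of the four numbers are independent, which is why the normalization check is a useful guard after any numerical manipulation of Euler parameters.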

In the canonical form for a rotation $\mathbf{x}' = R^\dagger\mathbf{x}R$, it is obvious that the same
rotation will result if we change the spinor $R$ to its negative $-R$. To understand
the significance of this ambiguity, use (3.22b) to write
$$-R = e^{-i\hat{\mathbf{a}}\pi}e^{(1/2)i\mathbf{a}} = e^{(1/2)i(-\hat{\mathbf{a}})(2\pi - a)} = e^{(1/2)i\mathbf{a}'}. \tag{3.24}$$

We may interpret $R$ as specifying a rotation in the right-handed sense about
the ‘axis’ $\hat{\mathbf{a}}$ through an angle $a = |\mathbf{a}|$ in the range $0 \le a \le 2\pi$. So (3.24)
shows that $-R$ specifies the rotation with opposite sense through the
complementary angle $a' = 2\pi - a$. Thus $R$ and $-R$ represent equivalent rotations
with opposite senses as shown in Figure 3.4. The representation of a rotation
as a linear transformation in the form $R^\dagger\mathbf{x}R$ or as an orthogonal matrix does
not distinguish between these two possibilities. Therefore, spinors provide a
more general representation of rotations than orthogonal matrices. Specifically,
each unimodular spinor represents a unique oriented rotation, whereas
each orthogonal matrix represents an unoriented rotation. Note that the
vector $\boldsymbol{\beta}$ in (3.22a) specifies the oriented axis of the rotation directly. Also,
from (3.23a) we see that $\alpha = \langle R\rangle_0$ is always positive for rotations through
angles less than $\pi$ and always negative for their complementary rotations. So
the “shortest” of two complementary rotations is represented by the spinor
with positive scalar part.
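For instance (an illustrative choice of angle, not an example from the text), take $\mathbf{a} = \tfrac{\pi}{2}\boldsymbol{\sigma}_3$. Then (3.22b) and (3.24) give

```latex
\begin{aligned}
R  &= e^{(1/2)i\boldsymbol{\sigma}_3\pi/2}
    = \tfrac{1}{\sqrt{2}}\,(1 + i\boldsymbol{\sigma}_3), \\
-R &= e^{(1/2)i(-\boldsymbol{\sigma}_3)(3\pi/2)}
    = \cos\tfrac{3\pi}{4} - i\boldsymbol{\sigma}_3\sin\tfrac{3\pi}{4}
    = -\tfrac{1}{\sqrt{2}}\,(1 + i\boldsymbol{\sigma}_3).
\end{aligned}
```

So $-R$ describes a rotation through 270° about $-\boldsymbol{\sigma}_3$, which reaches the same final position as the 90° rotation about $\boldsymbol{\sigma}_3$ described by $R$. The scalar part of $R$ is positive and that of $-R$ is negative, in agreement with (3.23a).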


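Fig. 3.4. Equivalent Rotations.


Composition of Rotations


The product (or composite) of a rotation

$$\mathcal{R}_1\mathbf{x} = R_1^\dagger\mathbf{x}R_1$$

with a rotation

$$\mathcal{R}_2\mathbf{x} = R_2^\dagger\mathbf{x}R_2$$
is a linear transformation

$$\mathcal{R}_3\mathbf{x} = R_3^\dagger\mathbf{x}R_3, \tag{3.25a}$$
where

$$\mathcal{R}_3 = \mathcal{R}_2\mathcal{R}_1 \tag{3.25b}$$
and

$$R_3 = R_1R_2. \tag{3.25c}$$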


In Section 2-3 we proved that the product of two spinors produces a spinor.
Furthermore,

$$R_3^\dagger R_3 = (R_1R_2)^\dagger R_1R_2 = R_2^\dagger R_1^\dagger R_1R_2 = 1,$$

since we have assumed $R_1^\dagger R_1 = R_2^\dagger R_2 = 1$. Therefore $R_3$ is a unitary spinor
and $\mathcal{R}_3$ is a rotation. This will provide us with a proof that the product of two
rotations is always a rotation, once we have established that every rotation
can be written in the canonical form we have used.

Equations (3.25a, b, c) show that the problem of determining the rotation
$\mathcal{R}_3$ which is equivalent to the product of rotations $\mathcal{R}_2\mathcal{R}_1$ can be reduced to a
straightforward computation of the geometric product of spinors. If the
spinor equation (3.25c) is expressed in terms of rotation angles, it becomes

$$e^{(1/2)i\mathbf{a}_3} = e^{(1/2)i\mathbf{a}_1}e^{(1/2)i\mathbf{a}_2}. \tag{3.26}$$

This equation determines the rotation angle $\mathbf{a}_3$ in terms of angles $\mathbf{a}_1$ and $\mathbf{a}_2$.
The same equation is basic to spherical trigonometry (Appendix A). Indeed,
the problem of determining the product of two rotations is mathematically
equivalent to the problem of solving a spherical triangle.

For computational purposes, it is more convenient to express (3.25c) in the 
Eulerian form 


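$$\alpha_3 + i\boldsymbol{\beta}_3 = (\alpha_1 + i\boldsymbol{\beta}_1)(\alpha_2 + i\boldsymbol{\beta}_2) \tag{3.27}$$

rather than the angular form (3.26). Expanding the product in (3.27) and
equating scalar and bivector parts separately, we obtain the following
expressions for the Euler parameters of $R_3 = \alpha_3 + i\boldsymbol{\beta}_3$:

$$\alpha_3 = \alpha_1\alpha_2 - \boldsymbol{\beta}_1\cdot\boldsymbol{\beta}_2, \tag{3.28a}$$
$$\boldsymbol{\beta}_3 = \alpha_1\boldsymbol{\beta}_2 + \alpha_2\boldsymbol{\beta}_1 - \boldsymbol{\beta}_1\times\boldsymbol{\beta}_2. \tag{3.28b}$$

These equations can be expressed as relations among rotation angles by using
(3.23a, b), but the results are so complicated that it is clearly much easier to
avoid angles and work directly with Euler parameters whenever possible.
Note that (3.28b) gives us the rotation axis $\hat{\boldsymbol{\beta}}_3 = \hat{\mathbf{a}}_3$ without requiring that we
use angles.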
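The step from (3.27) to (3.28a, b) can be written out explicitly. The following is a sketch of the expansion, using only $i^2 = -1$, the fact that $i$ commutes with vectors, and $\boldsymbol{\beta}_1\boldsymbol{\beta}_2 = \boldsymbol{\beta}_1\cdot\boldsymbol{\beta}_2 + i\,\boldsymbol{\beta}_1\times\boldsymbol{\beta}_2$:

```latex
\begin{aligned}
(\alpha_1 + i\boldsymbol{\beta}_1)(\alpha_2 + i\boldsymbol{\beta}_2)
 &= \alpha_1\alpha_2
    + i(\alpha_1\boldsymbol{\beta}_2 + \alpha_2\boldsymbol{\beta}_1)
    + i^2\,\boldsymbol{\beta}_1\boldsymbol{\beta}_2 \\
 &= \alpha_1\alpha_2 - \boldsymbol{\beta}_1\cdot\boldsymbol{\beta}_2
    + i\,(\alpha_1\boldsymbol{\beta}_2 + \alpha_2\boldsymbol{\beta}_1
          - \boldsymbol{\beta}_1\times\boldsymbol{\beta}_2).
\end{aligned}
```

The scalar part is (3.28a), and the dual of the bivector part is (3.28b).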


Example 


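To illustrate the composition of rotations, let us compute the product of
rotations by 90° about orthogonal axes, as described by the spinors

$$R_1 = e^{(1/2)i\boldsymbol{\sigma}_1\pi/2} = \cos\frac{\pi}{4} + i\boldsymbol{\sigma}_1\sin\frac{\pi}{4} = \frac{1}{\sqrt{2}}(1 + i\boldsymbol{\sigma}_1), \tag{3.29a}$$

$$R_2 = e^{(1/2)i\boldsymbol{\sigma}_2\pi/2} = \frac{1}{\sqrt{2}}(1 + i\boldsymbol{\sigma}_2). \tag{3.29b}$$

We can compute the product directly without using (3.28a, b). Thus, using
$\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2 = i\boldsymbol{\sigma}_3$,

$$R_3 = R_1R_2 = \frac{(1 + i\boldsymbol{\sigma}_1)}{\sqrt{2}}\frac{(1 + i\boldsymbol{\sigma}_2)}{\sqrt{2}} = \frac{1}{2}\left[1 + i(\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2 - \boldsymbol{\sigma}_3)\right] = \frac{1}{2} + i\hat{\mathbf{a}}_3\frac{\sqrt{3}}{2} = e^{(1/2)i\hat{\mathbf{a}}_3 2\pi/3}. \tag{3.30}$$
Thus, the composite rotation is by $2\pi/3 = 120°$ about the “diagonal axis”
$\hat{\mathbf{a}}_3 = (\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2 - \boldsymbol{\sigma}_3)/\sqrt{3}$. The reader can check this result by performing the
rotations on some solid object, as shown in Figure 3.5.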

It is worth emphasizing that the product of two rotations is generally not
commutative, that is, $\mathcal{R}_2\mathcal{R}_1 \neq \mathcal{R}_1\mathcal{R}_2$. This fact is perfectly expressed by the
noncommutativity of the corresponding spinors, $R_1R_2 \neq R_2R_1$. The results of
performing the rotations specified by (3.29a, b) in both orders are illustrated
in Figure 3.5.


Fig. 3.5. Composition of Rotations.
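The 90° example and its dependence on order can be checked numerically with the composition rule (3.28a, b). This is a minimal sketch: spinors are stored as pairs `(alpha, beta)` with `beta` a 3-tuple of components on the standard basis, and the helper names `dot`, `cross` and `compose` are assumptions made for illustration.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def compose(r1, r2):
    """Euler parameters of the spinor product R1 R2, from (3.28a, b)."""
    a1, b1 = r1
    a2, b2 = r2
    a3 = a1 * a2 - dot(b1, b2)                                     # (3.28a)
    c = cross(b1, b2)
    b3 = tuple(a1 * b2[k] + a2 * b1[k] - c[k] for k in range(3))   # (3.28b)
    return a3, b3

s = 1.0 / math.sqrt(2.0)
R1 = (s, (s, 0.0, 0.0))    # 90 degrees about sigma_1, as in (3.29a)
R2 = (s, (0.0, s, 0.0))    # 90 degrees about sigma_2, as in (3.29b)

alpha3, beta3 = compose(R1, R2)            # R1 R2, the product in (3.30)
alpha3_rev, beta3_rev = compose(R2, R1)    # R2 R1, the opposite order

angle_deg = math.degrees(2.0 * math.acos(alpha3))   # rotation angle from (3.23a)
```

Both orders give a 120° rotation, but `beta3` is proportional to $\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2 - \boldsymbol{\sigma}_3$ while `beta3_rev` is proportional to $\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2 + \boldsymbol{\sigma}_3$, so the two composite rotations have different axes.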
The theory of rotations presented here was developed in the mid-nineteenth
century by the mathematical physicist William Rowan Hamilton,
who called it the quaternion theory of rotations.

Our formulation differs from Hamilton’s in utilizing the entire geometric
algebra $\mathcal{G}_3$. Hamilton employed only the quaternions, which we identify with
the even subalgebra of $\mathcal{G}_3$. Geometric algebra integrates quaternions with the
rest of vector and matrix algebra and thus makes Hamilton’s powerful theory
of rotations available without translation to a different mathematical
language. Hamilton’s theory has been little used in this century just because its
relation to conventional vector algebra was obscure.

In this century Hamilton’s theory has been independently rediscovered in
an equivalent matrix form called the spinor theory of rotations. The spinor
theory is widely used by physicists in advanced quantum mechanics. We have
preferred the term “spinor” to “quaternion” to call attention to the fact that
our concept of spinor is equivalent to the concept of spinor in quantum
mechanics. Proof of this equivalence will be given in a subsequent book,
NFII.


Matrix Elements of a Rotation 


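A rotation $\mathcal{R}$ transforms a standard frame $\{\boldsymbol{\sigma}_k\}$ into a new set of
orthonormal vectors

$$\mathbf{e}_k = \mathcal{R}\boldsymbol{\sigma}_k = R^\dagger\boldsymbol{\sigma}_kR = \sum_j \boldsymbol{\sigma}_je_{jk}. \tag{3.31}$$

The matrix elements of the rotation are therefore given by
$$e_{jk} = \boldsymbol{\sigma}_j\cdot\mathbf{e}_k = \boldsymbol{\sigma}_j\cdot(\mathcal{R}\boldsymbol{\sigma}_k) = \langle\boldsymbol{\sigma}_jR^\dagger\boldsymbol{\sigma}_kR\rangle_0. \tag{3.32}$$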


This enables us to compute the matrix elements directly from the spinor R 
and express them in terms of any parameters used to parametrize R. For 
example, to express the matrix elements in terms of Euler parameters, we 
substitute (3.22a) into (3.32) to get 


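$$e_{jk} = \langle\alpha^2\boldsymbol{\sigma}_j\boldsymbol{\sigma}_k + \boldsymbol{\sigma}_j\boldsymbol{\beta}\boldsymbol{\sigma}_k\boldsymbol{\beta} + \alpha i\boldsymbol{\sigma}_j(\boldsymbol{\sigma}_k\boldsymbol{\beta} - \boldsymbol{\beta}\boldsymbol{\sigma}_k)\rangle_0.$$

Evaluating the scalar parts in terms of inner products, we obtain the desired
result

$$e_{jk} = (2\alpha^2 - 1)\delta_{jk} + 2\beta_j\beta_k - 2\alpha\sum_i \beta_i\epsilon_{ijk}, \tag{3.33}$$

where $\epsilon_{ijk} = -\langle i\boldsymbol{\sigma}_i\boldsymbol{\sigma}_j\boldsymbol{\sigma}_k\rangle_0 = \boldsymbol{\sigma}_i\cdot(\boldsymbol{\sigma}_j\times\boldsymbol{\sigma}_k)$. The sign of the last term
follows from $\alpha i(\boldsymbol{\sigma}_k\boldsymbol{\beta} - \boldsymbol{\beta}\boldsymbol{\sigma}_k) = 2\alpha\,\boldsymbol{\beta}\times\boldsymbol{\sigma}_k$.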
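A short numerical sketch can confirm the matrix elements obtained from Euler parameters. It evaluates $e_{jk}$ from the three terms of the scalar-part expression (with the last term written as $-2\alpha\sum_i\beta_i\epsilon_{ijk}$) and compares the first column with the image of $\boldsymbol{\sigma}_1$ computed directly from the vector form $\mathbf{x}' = \mathbf{x} + 2\alpha\boldsymbol{\beta}\times\mathbf{x} + 2\boldsymbol{\beta}\times(\boldsymbol{\beta}\times\mathbf{x})$ of Exercise (3.6). The function names and the zero-based indices are assumptions made for illustration.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def levi_civita(i, j, k):
    """Permutation symbol for indices 0, 1, 2."""
    return ((i - j) * (j - k) * (k - i)) // 2

def matrix_from_euler(alpha, beta):
    """Matrix elements e[j][k] = sigma_j . (R^dagger sigma_k R)."""
    e = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        for k in range(3):
            delta = 1.0 if j == k else 0.0
            eps_sum = sum(beta[i] * levi_civita(i, j, k) for i in range(3))
            e[j][k] = ((2.0 * alpha**2 - 1.0) * delta
                       + 2.0 * beta[j] * beta[k]
                       - 2.0 * alpha * eps_sum)
    return e

def rotate(alpha, beta, x):
    """Rotate x directly, using the vector form of Exercise (3.6)."""
    bx = cross(beta, x)
    bbx = cross(beta, bx)
    return tuple(x[k] + 2.0 * alpha * bx[k] + 2.0 * bbx[k] for k in range(3))

theta = math.radians(30.0)                       # rotation about sigma_3
alpha, beta = math.cos(theta / 2), (0.0, 0.0, math.sin(theta / 2))
e = matrix_from_euler(alpha, beta)
e1 = rotate(alpha, beta, (1.0, 0.0, 0.0))        # image of sigma_1
```

For this right-handed rotation about $\boldsymbol{\sigma}_3$ the element `e[1][0]` equals $\sin 30° = 0.5$, which is a quick way to check the sign convention of the antisymmetric term.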


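Equation (3.32) enables us to translate from spinor to matrix representations
for rotations. To translate the other way, from given matrix elements to a
spinor, we need to solve (3.31) for $R$ in terms of the $\mathbf{e}_k$ and $\boldsymbol{\sigma}_k$. This can be
done most easily by constructing the quaternion
$$T = \sum_k \boldsymbol{\sigma}_k\mathbf{e}_k = \sum_{j,k} e_{jk}\boldsymbol{\sigma}_k\boldsymbol{\sigma}_j, \tag{3.34}$$
which is uniquely determined by the $\boldsymbol{\sigma}_k$ and $\mathbf{e}_k$. According to (3.31), then,
$$T = \sum_k \boldsymbol{\sigma}_kR^\dagger\boldsymbol{\sigma}_kR.$$


So our problem is to solve this equation for $R$ as a function of $T$. To that end,
notice that


$$\sum_k \boldsymbol{\sigma}_k\boldsymbol{\sigma}_k = 3$$


and


$$\sum_k \boldsymbol{\sigma}_k\boldsymbol{\beta}\boldsymbol{\sigma}_k = \sum_k \boldsymbol{\sigma}_k(2\boldsymbol{\beta}\cdot\boldsymbol{\sigma}_k - \boldsymbol{\sigma}_k\boldsymbol{\beta}) = 2\boldsymbol{\beta} - 3\boldsymbol{\beta} = -\boldsymbol{\beta}.$$


Therefore, for $R = \alpha + i\boldsymbol{\beta}$, we have
$$T = \sum_k \boldsymbol{\sigma}_k(\alpha - i\boldsymbol{\beta})\boldsymbol{\sigma}_kR = (3\alpha + i\boldsymbol{\beta})R = (4\alpha - R^\dagger)R = 4\alpha R - 1.$$


Thus,
$$4\alpha R = 1 + T. \tag{3.35}$$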


We can solve this equation for $\alpha = \langle R\rangle_0$ by taking the scalar part or by
computing the norm of both sides. Thus, we obtain


$$16\alpha^2 = 4\langle 1 + T\rangle_0 = |1 + T|^2. \tag{3.36}$$
So, if $\alpha \neq 0$, we can solve (3.35) for $R$ in the form


$$R = \frac{1 + T}{\pm|1 + T|} = \frac{1 + T}{\pm 2\langle 1 + T\rangle_0^{1/2}}. \tag{3.37}$$

This enables us to compute $R$ from the matrix elements $e_{jk}$, or any other
specification of the $\mathbf{e}_k$ and the $\boldsymbol{\sigma}_k$, with the help of the definition of $T$ in (3.34).
Unfortunately (3.37) is undefined for any rotation through an angle of 180°, in
which case $\alpha = \langle R\rangle_0 = 0$ and $T = -1$. To handle this case, we need an
alternative parametrization of $R$ in terms of the matrix elements.
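In components, (3.34) and (3.37) reduce to a few lines of arithmetic. Since $\boldsymbol{\sigma}_k\boldsymbol{\sigma}_j = \delta_{jk} + i\sum_m\epsilon_{kjm}\boldsymbol{\sigma}_m$, the scalar part of $1 + T$ is one plus the trace of the matrix, and the dual of its bivector part is the vector with components $(e_{32} - e_{23},\; e_{13} - e_{31},\; e_{21} - e_{12})$. The sketch below normalizes $1 + T$ and takes the positive sign in (3.37). The function name, the tolerance and the nested-list matrix layout `e[j][k]` are assumptions made for illustration.

```python
import math

def spinor_from_matrix(e):
    """Euler parameters (alpha, beta) with alpha > 0 from matrix elements e[j][k],
    obtained by normalizing 1 + T as in (3.37)."""
    scalar = 1.0 + e[0][0] + e[1][1] + e[2][2]            # scalar part of 1 + T
    t = (e[2][1] - e[1][2],                               # dual of the bivector
         e[0][2] - e[2][0],                               # part of T
         e[1][0] - e[0][1])
    norm = math.sqrt(scalar**2 + sum(c * c for c in t))   # |1 + T|
    if norm < 1e-12:
        raise ValueError("rotation through 180 degrees: (3.37) is undefined")
    return scalar / norm, tuple(c / norm for c in t)

theta = math.radians(60.0)
c, s = math.cos(theta), math.sin(theta)
e = [[c, -s, 0.0],
     [s,  c, 0.0],
     [0.0, 0.0, 1.0]]        # columns are the rotated basis vectors e_k
alpha, beta = spinor_from_matrix(e)
```

For the 60° rotation about $\boldsymbol{\sigma}_3$ this returns $\alpha = \cos 30°$ and $\boldsymbol{\beta} = \boldsymbol{\sigma}_3\sin 30°$. For a half turn both the scalar and the bivector part of $1 + T$ vanish, so the sketch raises an error instead of dividing by zero.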

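Let us see how much we can find out about $R$ given the transformation of a
single vector, say $\boldsymbol{\sigma}_3$ to $\mathbf{e}_3$. From (3.31) we obtain


$$R\mathbf{e}_3 = \boldsymbol{\sigma}_3R,$$


so


$$(\alpha + i\boldsymbol{\beta})\mathbf{e}_3 = \boldsymbol{\sigma}_3(\alpha + i\boldsymbol{\beta}).$$
Since we already have a general expression for $\alpha$ in (3.36), we seek to solve
this equation for $\boldsymbol{\beta}$. Using the trick $\boldsymbol{\sigma}_3\boldsymbol{\beta} = -\boldsymbol{\beta}\boldsymbol{\sigma}_3 + 2\boldsymbol{\beta}\cdot\boldsymbol{\sigma}_3$ to reorder the
geometric product, we obtain
$$i\boldsymbol{\beta}(\boldsymbol{\sigma}_3 + \mathbf{e}_3) = 2i\beta_3 + \alpha(\boldsymbol{\sigma}_3 - \mathbf{e}_3).$$
We solve this for $\boldsymbol{\beta}$ by using


$$(\boldsymbol{\sigma}_3 + \mathbf{e}_3)^{-1} = \frac{\boldsymbol{\sigma}_3 + \mathbf{e}_3}{|\boldsymbol{\sigma}_3 + \mathbf{e}_3|^2} = \frac{\boldsymbol{\sigma}_3 + \mathbf{e}_3}{2(1 + e_{33})}$$
and
$$(\boldsymbol{\sigma}_3 - \mathbf{e}_3)(\boldsymbol{\sigma}_3 + \mathbf{e}_3)^{-1} = \frac{i\,\boldsymbol{\sigma}_3\times\mathbf{e}_3}{1 + e_{33}}.$$


Thus, we obtain
$$\boldsymbol{\beta} = (1 + e_{33})^{-1}\left[\beta_3(\boldsymbol{\sigma}_3 + \mathbf{e}_3) + \alpha\,\boldsymbol{\sigma}_3\times\mathbf{e}_3\right]. \tag{3.38}$$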


This gives us $\boldsymbol{\beta}$ from $\boldsymbol{\sigma}_3$ and $\mathbf{e}_3$ provided we know $\beta_3$ and $\alpha$. From (3.34) and
(3.36) we have


$$4\alpha^2 = 1 + e_{11} + e_{22} + e_{33}. \tag{3.39}$$
Using this in (3.33) we obtain
$$4\beta_3^2 = 1 + e_{33} - e_{11} - e_{22}. \tag{3.40}$$


Unfortunately, (3.39) and (3.40) do not determine the correct sign of $\alpha$
relative to $\beta_3$. However, this can be taken care of by using (3.33) again to get


$$\alpha = \frac{e_{21} - e_{12}}{4\beta_3}. \tag{3.41}$$


Equation (3.38) supplemented by (3.40) and (3.41) provides a practical
means of computing $R = \alpha + i\boldsymbol{\beta}$ from the matrix elements, though it is not so
neat as (3.37).

Equation (3.38) is singular only when $e_{33} = -1$, that is, when $\boldsymbol{\sigma}_3$ is rotated by
180°. Of course, we get two similar equations by changing the subscripts from
3 to 2 or 1 in (3.38). At least one of these three equations will be nonsingular
for any rotation. For numerical purposes, the optimal choice among the three
possibilities corresponds to the most positive among $e_{11}$, $e_{22}$, and $e_{33}$. This
amounts to selecting from $\boldsymbol{\sigma}_1$, $\boldsymbol{\sigma}_2$, and $\boldsymbol{\sigma}_3$ the vector which is closest to the
rotation axis.

By deriving explicit formulas for calculating the spinor of a rotation from
any given rotation matrix, we have proved, as a by-product, the earlier
assertion that every rotation $\mathcal{R}$ can be written in the canonical form $\mathcal{R}\mathbf{x} = R^\dagger\mathbf{x}R$.
For we know that every rotation is determined by its matrix or the
transformation of a standard basis.

With geometric algebra at our disposal, it should be obvious that the spinor
representation of rotations is superior to the matrix representation for both
theoretical and computational purposes. Rotations are characterized so much
more simply and directly by spinors than by matrices! Even in problems
where a rotation matrix is given as part of the initial data, the most efficient
way to use it is usually to convert it to an equivalent spinor using (3.37) or
(3.38). This, for example, is the most efficient general method for finding the
rotation axis and angle from the matrix elements; for, according to (3.22), the
angle and axis can be read off directly from the spinor. In many problems an
appropriate spinor can be written down directly or calculated from the given
data without introducing matrices. It is best to avoid the matrix representation
of a rotation whenever possible, but we have developed the apparatus to
handle it if necessary.
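The alternative route through (3.38), (3.40) and (3.41) can be sketched numerically as follows. The subscript 3 is replaced by the index $k$ of the most positive diagonal element, with $(i, j, k)$ a cyclic permutation of the indices, so that (3.41) reads $\alpha = (e_{ji} - e_{ij})/4\beta_k$. Taking the positive square root for $\beta_k$ is a choice made here (the other choice yields $-R$, which describes the same rotation), and the function name and tolerance are likewise assumptions made for illustration.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def spinor_from_matrix_robust(e):
    """Euler parameters (alpha, beta) from matrix elements e[j][k];
    remains valid for rotations through 180 degrees."""
    trace = e[0][0] + e[1][1] + e[2][2]
    k = max(range(3), key=lambda n: e[n][n])        # most positive diagonal element
    i, j = (k + 1) % 3, (k + 2) % 3                 # (i, j, k) is cyclic
    beta_k = 0.5 * math.sqrt(max(0.0, 1.0 + 2.0 * e[k][k] - trace))   # as in (3.40)
    if beta_k < 1e-12:
        return 1.0, (0.0, 0.0, 0.0)                 # identity: no rotation at all
    alpha = (e[j][i] - e[i][j]) / (4.0 * beta_k)    # as in (3.41)
    sigma_k = tuple(1.0 if n == k else 0.0 for n in range(3))
    e_k = tuple(e[n][k] for n in range(3))          # image of sigma_k (column k)
    c = cross(sigma_k, e_k)
    beta = tuple((beta_k * (sigma_k[n] + e_k[n]) + alpha * c[n]) / (1.0 + e[k][k])
                 for n in range(3))                 # as in (3.38)
    return alpha, beta

# Half turn about the axis (sigma_1 + sigma_2)/sqrt(2), where (3.37) fails.
half_turn = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 0.0],
             [0.0, 0.0, -1.0]]
alpha_h, beta_h = spinor_from_matrix_robust(half_turn)

# 120 degrees about (sigma_1 + sigma_2 + sigma_3)/sqrt(3): a cyclic permutation.
cyclic = [[0.0, 0.0, 1.0],
          [1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]
alpha_c, beta_c = spinor_from_matrix_robust(cyclic)
```

Because the trace of a rotation matrix is never less than $-1$, the largest diagonal element is never less than $-1/3$, so the divisor $1 + e_{kk}$ cannot vanish for the chosen $k$.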


Euler Angles 


We have considered several different parametrizations for spinors and ro- 
tations. Another parametrization which has been widely used by physicists 
and astronomers is specified by the equation 


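$$R = e^{(1/2)i\boldsymbol{\sigma}_3\psi}e^{(1/2)i\boldsymbol{\sigma}_1\theta}e^{(1/2)i\boldsymbol{\sigma}_3\phi}. \tag{3.42}$$
Introducing the notation

$$R_\psi = e^{(1/2)i\boldsymbol{\sigma}_3\psi}, \qquad R_\phi = e^{(1/2)i\boldsymbol{\sigma}_3\phi}, \tag{3.43a}$$

$$Q_\theta = e^{(1/2)i\boldsymbol{\sigma}_1\theta}, \tag{3.43b}$$
we can write the parametrized spinor $R$ in the form

$$R = R_\psi Q_\theta R_\phi. \tag{3.43c}$$


The scalar parameters $\psi$, $\theta$, $\phi$ introduced in this way are called Euler angles.
The advantage of using Euler angles is that every rotation is reduced to a
product of rotations about fixed axes of a standard basis. It is especially easy,
then, to calculate the matrix elements of a rotation in terms of Euler angles.
The rotation of a standard basis is given by


$$\mathbf{e}_k = \mathcal{R}\boldsymbol{\sigma}_k = R^\dagger\boldsymbol{\sigma}_kR = R_\phi^\dagger Q_\theta^\dagger R_\psi^\dagger\boldsymbol{\sigma}_kR_\psi Q_\theta R_\phi. \tag{3.44}$$
Consider the rotation of $\boldsymbol{\sigma}_3$, for example. From (3.43a) we have
$$R_\psi^\dagger\boldsymbol{\sigma}_3R_\psi = \boldsymbol{\sigma}_3,$$
and, since $i\boldsymbol{\sigma}_3\boldsymbol{\sigma}_1 = -\boldsymbol{\sigma}_2$,
$$Q_\theta^\dagger\boldsymbol{\sigma}_3Q_\theta = \boldsymbol{\sigma}_3Q_\theta^2 = \boldsymbol{\sigma}_3e^{i\boldsymbol{\sigma}_1\theta} = \boldsymbol{\sigma}_3\cos\theta - \boldsymbol{\sigma}_2\sin\theta.$$
Therefore,
$$\begin{aligned}
\mathbf{e}_3 &= R_\phi^\dagger(\boldsymbol{\sigma}_3\cos\theta - \boldsymbol{\sigma}_2\sin\theta)R_\phi \\
&= \boldsymbol{\sigma}_3\cos\theta - \boldsymbol{\sigma}_2e^{i\boldsymbol{\sigma}_3\phi}\sin\theta \\
&= \boldsymbol{\sigma}_3\cos\theta + (\boldsymbol{\sigma}_1\sin\phi - \boldsymbol{\sigma}_2\cos\phi)\sin\theta.
\end{aligned} \tag{3.45}$$


From this the matrix elements $e_{j3} = \boldsymbol{\sigma}_j\cdot\mathbf{e}_3$ can be read off directly (Exercise
3.8). Figure 3.6 shows the effect on a standard basis of the successive rotations
by Euler angles as specified by (3.42).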

From the final diagram in Figure 3.6, we can see a different way to describe
rotations with Euler angles. We can read off the following spinor for the
rotation directly from the diagram:

$$R = e^{(1/2)i\boldsymbol{\sigma}_3\phi}e^{(1/2)i\mathbf{n}\theta}e^{(1/2)i\mathbf{e}_3\psi}. \tag{3.46}$$

Fig. 3.6. Rotations determined by the Euler angles.

This expression tells us that the net rotation can be achieved by a rotation of
angle $\phi$ about the $\boldsymbol{\sigma}_3$-axis, followed by a rotation of angle $\theta$ about the
so-called line of nodes, which has direction

$$\mathbf{n} = R_\phi^\dagger\boldsymbol{\sigma}_1R_\phi = \boldsymbol{\sigma}_1R_\phi^2 = \boldsymbol{\sigma}_1e^{i\boldsymbol{\sigma}_3\phi} = \frac{\boldsymbol{\sigma}_3\times\mathbf{e}_3}{|\boldsymbol{\sigma}_3\times\mathbf{e}_3|}, \tag{3.47}$$

followed finally by a rotation of angle $\psi$ about $\mathbf{e}_3$.

Note that the order of Euler angles in (3.46) is opposite that in (3.42).
Nevertheless, both expressions describe the same rotation; they show that the
parametrization by Euler angles can be given two different geometrical
interpretations. As we shall see in Chapter 8, the form (3.46) is preferred by
astronomers, because $\boldsymbol{\sigma}_3$ and $\mathbf{e}_3$ can be associated with easily measured
directions. On the other hand, (3.42) has the advantage of fixed rotation axes
even for Euler angles changing with time.

To prove algebraically that (3.46) is indeed equivalent to (3.42), note that


$$e^{(1/2)i\mathbf{n}\theta} = R_\phi^\dagger Q_\theta R_\phi$$
and
$$e^{(1/2)i\mathbf{e}_3\psi} = R^\dagger R_\psi R = R_\phi^\dagger Q_\theta^\dagger R_\psi Q_\theta R_\phi.$$


Substituting this into (3.46) and using the unitarity condition $R^\dagger R = 1$ for the
various spinors, we get (3.42) as desired. The student should carry this step
out to see how it works.
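The equivalence of the two Euler-angle forms can also be checked numerically. The sketch below builds the spinor of the fixed-axis form (3.42), computes $\mathbf{e}_3$ and compares it with (3.45), and then builds the moving-axis form (3.46) from $\boldsymbol{\sigma}_3$, the line of nodes $\mathbf{n}$ and $\mathbf{e}_3$. Spinors are stored as Euler parameters `(alpha, beta)`; the helper names and the particular test angles are assumptions made for illustration.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def spinor(axis, angle):
    """exp((1/2) i axis angle) for a unit vector axis, as (alpha, beta)."""
    h = 0.5 * angle
    return math.cos(h), tuple(math.sin(h) * c for c in axis)

def mul(r1, r2):
    """Spinor product R1 R2 in Euler parameters, from (3.28a, b)."""
    (a1, b1), (a2, b2) = r1, r2
    c = cross(b1, b2)
    return (a1 * a2 - dot(b1, b2),
            tuple(a1 * b2[k] + a2 * b1[k] - c[k] for k in range(3)))

def rotate(r, x):
    """x' = R^dagger x R, written as x + 2 alpha (beta x x) + 2 beta x (beta x x)."""
    alpha, beta = r
    bx = cross(beta, x)
    bbx = cross(beta, bx)
    return tuple(x[k] + 2.0 * alpha * bx[k] + 2.0 * bbx[k] for k in range(3))

s1, s3 = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
psi, theta, phi = 0.3, 0.7, 1.1            # arbitrary test angles in radians

# Fixed-axis form (3.42): R = R_psi Q_theta R_phi.
R = mul(mul(spinor(s3, psi), spinor(s1, theta)), spinor(s3, phi))
e3 = rotate(R, s3)
e3_expected = (math.sin(phi) * math.sin(theta),
               -math.cos(phi) * math.sin(theta),
               math.cos(theta))            # components of (3.45)

# Moving-axis form (3.46): phi about sigma_3, theta about n, psi about e_3.
n = (math.cos(phi), math.sin(phi), 0.0)    # line of nodes, from (3.47)
R_alt = mul(mul(spinor(s3, phi), spinor(n, theta)), spinor(e3, psi))
```

Up to rounding, `R_alt` has the same Euler parameters as `R`, which is the numerical counterpart of the substitution argument given above.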


Canonical Forms for Linear Operators 


We have found canonical forms for rotations and simple reflections in this
section and for symmetric operators in the preceding section. These results
determine canonical forms for all nonsingular linear operators, as we can now
easily show.

To complete our characterization of orthogonal operators, we need canonical
forms for both proper and improper operators. The canonical form for a
proper operator (i.e. rotation) is given by (3.15). The canonical form for an
arbitrary improper orthogonal operator $\mathcal{I}$ is determined by the following
theorem: Let $\mathcal{U}$ be a simple reflection along any direction $\mathbf{u}$, as expressed by
the canonical form (3.6). Then, there is a unique rotation $\mathcal{R}$ such that


$$\mathcal{I} = \mathcal{R}\mathcal{U}. \tag{3.48}$$
A canonical form for $\mathcal{I}$ is therefore determined by the canonical forms for $\mathcal{U}$
and $\mathcal{R}$. The proof of (3.48) is easy. One simply uses the fact that $\mathcal{U}^2 = 1$ to
write $\mathcal{I} = (\mathcal{I}\mathcal{U})\mathcal{U}$ and so define $\mathcal{R}$ by $\mathcal{R} = \mathcal{I}\mathcal{U}$. The fact that $\mathcal{R}$ is a rotation
is proved by $\det\mathcal{R} = (\det\mathcal{I})(\det\mathcal{U}) = (-1)(-1) = 1$.
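Canonical forms for arbitrary nonsingular operators are determined by
the Polar Decomposition Theorem: Every nonsingular tensor $f$ has a unique
decomposition of the form


$$f = \mathcal{R}\mathcal{S} = \mathcal{S}'\mathcal{R}, \tag{3.49}$$


where $\mathcal{R}$ is a rotation and $\mathcal{S}$ and $\mathcal{S}'$ are positive symmetric tensors given by
$$\mathcal{S} = (\bar{f}f)^{1/2} \quad\text{and}\quad \mathcal{S}' = (f\bar{f})^{1/2}. \tag{3.50}$$


A canonical form for $f$ is therefore determined by the canonical forms for $\mathcal{R}$
and $\mathcal{S}$.
To prove (3.49), first note that


$$\mathbf{y}\cdot(\bar{f}f\mathbf{x}) = (f\mathbf{y})\cdot(f\mathbf{x}) = \mathbf{x}\cdot(\bar{f}f\mathbf{y})$$
and
$$\mathbf{x}\cdot(\bar{f}f\mathbf{x}) = (f\mathbf{x})^2 > 0 \quad\text{if}\quad \mathbf{x} \neq 0.$$


Therefore, the operator $\bar{f}f$ is symmetric and positive, so its square root $\mathcal{S}$
specified by (3.50) is well defined and unique. Since $\mathcal{S}$ is nonsingular, we can
solve (3.49) for the rotation


$$\mathcal{R} = f\mathcal{S}^{-1} = f(\bar{f}f)^{-1/2}. \tag{3.51}$$


It is easily verified that $\mathcal{R}$ is indeed a rotation, and the properties of the
operator $\mathcal{S}'$ can be determined in much the same way as the properties of $\mathcal{S}$.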

The eigenvalues and eigenvectors of $\mathcal{S}$ describe basic structural features of
$f$. They are sometimes called principal vectors and principal values of $f$ to
distinguish them from eigenvectors and eigenvalues of $f$. There is, of course,
no distinction if $f$ itself is symmetric. The principal values of $\mathcal{S}$ are always real
numbers and, in general, they are not related in a simple way to the
eigenvalues of $f$, some of which may be “complex numbers”. The polar
decomposition theorem (3.49) tells us that the complex eigenvalues arise
from rotations with the imaginary numbers corresponding to bivectors for
planes of rotations, as we noted earlier in the special case of (2.14a, b). But
we have seen that geometric algebra enables us to characterize rotations
completely and effectively without reference to secular equations and complex
eigenvalues. Thus, the polar decomposition enables us to characterize
any linear transformation completely without introducing complex
eigenvalues and eigenvectors.

The polar decomposition (3.49) provides us with a simple geometrical
interpretation for any linear operator $f$. Consider the action of $f$ on the points
$\mathbf{x}$ of a 3-dimensional body or geometrical figure. According to (3.49), then,
the body is first stretched and/or reflected along the principal directions of $f$.
Then the distorted body is rotated through some angle specified by $\mathcal{R}$. In
contrast to the clear geometrical interpretation of principal directions and
principal values, in conventional treatments of linear algebra complex
eigenvectors and eigenvalues do not have an evident interpretation.
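A plane example makes the stretch-then-rotate picture concrete. The sketch below treats a 2 by 2 matrix with positive determinant, for which the rotation in $f = \mathcal{R}\mathcal{S}$ has the closed form $\tan\theta = (f_{21} - f_{12})/(f_{11} + f_{22})$, and applies it to the shear of Exercise (3.16). This closed form is a standard result for the plane and is not derived in the text. The function name `polar_2d` and the shear parameter `a` are assumptions made for illustration.

```python
import math

def polar_2d(f):
    """Polar decomposition f = R S of a 2x2 matrix with positive determinant.
    Returns the rotation angle of R and the symmetric factor S = R^T f."""
    (f11, f12), (f21, f22) = f
    angle = math.atan2(f21 - f12, f11 + f22)   # makes R^T f symmetric, trace > 0
    c, s = math.cos(angle), math.sin(angle)
    S = [[c * f11 + s * f21, c * f12 + s * f22],
         [-s * f11 + c * f21, -s * f12 + c * f22]]
    return angle, S

a = 0.5
shear = [[1.0, 2.0 * a],
         [0.0, 1.0]]          # matrix of the shear x -> x + 2a sigma_1 (sigma_2 . x)
angle, S = polar_2d(shear)

# Principal values: the eigenvalues of the symmetric factor S.
tr = S[0][0] + S[1][1]
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
disc = math.sqrt(tr * tr / 4.0 - det)
principal_values = (tr / 2.0 + disc, tr / 2.0 - disc)
```

The shear has the single real eigenvalue 1, yet its principal values are two distinct positive numbers whose product is 1, and the rotation angle is $-\arctan a$. This illustrates how principal values and eigenvalues differ for a nonsymmetric operator.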

We shall not have the occasion to apply the polar decomposition theorem 
to physics problems in this book. The theorem has been discussed only to 
provide the student with a general perspective on the structure of linear 
operators. 
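5-3. Exercises


(3.1) Show that the transformation $\mathcal{R}\mathbf{x} = \mathbf{u}^{-1}\mathbf{x}\mathbf{u}$, determined by a nonzero
vector $\mathbf{u}$, is a rotation. Find the axis, the angle and the spinor for this
rotation.

(3.2) Find the inverse of a reflection.

(3.3) Prove that the product of three successive elementary reflections in
orthogonal planes is an inversion, the linear transformation that
reverses the direction of every vector.
Explain how this fact made it easy to place mirrors on the moon
which reflect laser signals from Earth back to their source. What
precision measurements can be made with such signals?

(3.4) A unitary spinor $R$ can be given the following parametrizations

$$R = e^{(1/2)i\mathbf{a}} = \alpha + i\boldsymbol{\beta} = \alpha(1 + i\boldsymbol{\gamma}) = \frac{1 + i\mathbf{b}}{1 - i\mathbf{b}},$$

where the parameters $\mathbf{a}$, $\boldsymbol{\beta}$, $\boldsymbol{\gamma}$ and $\mathbf{b}$ are all vectors. Establish the
following relations among the parameters:

$$\boldsymbol{\gamma} = \frac{\boldsymbol{\beta}}{\alpha} = \tan\tfrac{1}{2}\mathbf{a} = \frac{2\mathbf{b}}{1 - \mathbf{b}^2},$$

$$\mathbf{b} = \tan\tfrac{1}{4}\mathbf{a}.$$
In the following problems
$$\mathbf{x}' = \mathcal{R}\mathbf{x} = R^\dagger\mathbf{x}R,$$
where $R$ has the parametrizations just described.

(3.5) Derive “Rodrigues’ formula”

$$\mathbf{x}' - \mathbf{x} = \boldsymbol{\gamma}\times(\mathbf{x}' + \mathbf{x}).$$

(3.6) Establish the following “vector forms” for a rotation:
$$\mathbf{x}' = \mathbf{x} + 2\alpha\boldsymbol{\beta}\times\mathbf{x} + 2\boldsymbol{\beta}\times(\boldsymbol{\beta}\times\mathbf{x})$$
$$\phantom{\mathbf{x}'} = \mathbf{x} + \hat{\mathbf{a}}\times\mathbf{x}\sin a + \hat{\mathbf{a}}\times(\hat{\mathbf{a}}\times\mathbf{x})(1 - \cos a).$$

(3.7) Derive the following expression for the matrix elements of a rotation
by an arbitrary vector angle $\mathbf{a}$:

$$e_{jk} = \delta_{jk}\cos a - \epsilon_{jkm}a_m\sin a + a_ja_k(1 - \cos a),$$

where $\epsilon_{jkm} = i^\dagger\boldsymbol{\sigma}_j\wedge\boldsymbol{\sigma}_k\wedge\boldsymbol{\sigma}_m$, and the $a_k = \hat{\mathbf{a}}\cdot\boldsymbol{\sigma}_k$ are “direction cosines”
of the rotation axis.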
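Evaluate matrix elements for


(a) A =<4;,


(b)a= at o> — 6):


20 C
soe

(3.8) Evaluate the matrix elements of a rotation in terms of Euler angles
to get the matrix
$$\begin{pmatrix}
\cos\psi\cos\phi - \cos\theta\sin\phi\sin\psi & -\sin\psi\cos\phi - \cos\theta\sin\phi\cos\psi & \sin\theta\sin\phi \\
\cos\psi\sin\phi + \cos\theta\cos\phi\sin\psi & -\sin\psi\sin\phi + \cos\theta\cos\phi\cos\psi & -\sin\theta\cos\phi \\
\sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta
\end{pmatrix}$$

(3.9) Show that any spinor $R$ can be written in the form
$$R = \pm(\mathbf{u}\mathbf{v})^{1/2} = \pm\frac{1 + \mathbf{u}\mathbf{v}}{[2(1 + \mathbf{u}\cdot\mathbf{v})]^{1/2}},$$
where $\mathbf{u}$ and $\mathbf{v}$ are unit vectors. Derive therefrom the trigonometric
formulas
$$\cos\tfrac{1}{2}a = \frac{1 + \cos a}{[2(1 + \cos a)]^{1/2}},$$
$$\sin\tfrac{1}{2}a = \frac{\sin a}{[2(1 + \cos a)]^{1/2}}.$$

(3.10) Given that a rotation $\mathcal{R}(\mathbf{x}) = R^\dagger\mathbf{x}R$ has the properties
$$\mathcal{R}(\mathbf{a}\times\mathbf{b}) = \mathbf{a}\times\mathbf{b},$$
$$\mathcal{R}(\mathbf{a}) = \mathbf{b},$$
show that
$$\pm R = \frac{1 + \hat{\mathbf{a}}\hat{\mathbf{b}}}{[2(1 + \hat{\mathbf{a}}\cdot\hat{\mathbf{b}})]^{1/2}}.$$

(3.11) For the composition of rotations described by the spinor equation
$$R_1R_2 = R_3,$$
where
$$R_k = e^{(1/2)i\mathbf{a}_k} = \alpha_k(1 + i\boldsymbol{\gamma}_k),$$
derive the “law of tangents”
$$\tan\tfrac{1}{2}\mathbf{a}_3 = \boldsymbol{\gamma}_3 = \frac{\boldsymbol{\gamma}_1 + \boldsymbol{\gamma}_2 - \boldsymbol{\gamma}_1\times\boldsymbol{\gamma}_2}{1 - \boldsymbol{\gamma}_1\cdot\boldsymbol{\gamma}_2}.$$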
(3.12) The sum of the diagonal matrix elements $f_{kk}$ of a linear transformation
$f$ is called the trace of $f$ and denoted by $\mathrm{Tr}\,f$. Show that the
trace of a rotation $\mathcal{R}$ is given by

$$\mathrm{Tr}\,\mathcal{R} = \sum_k \boldsymbol{\sigma}_k\cdot(\mathcal{R}\boldsymbol{\sigma}_k) = 1 + 2\cos\mathbf{a},$$

where $\mathbf{a}$ is the vector angle of rotation.

(3.13) (a) Prove that every reflection is a symmetric transformation.

(b) Under what condition will the product of two reflections be a
symmetric transformation?

(c) Prove that every symmetric transformation can be expressed as
the product of a symmetric orthogonal transformation and a
positive symmetric transformation.

(3.14) Explain how results developed in the text can be used to prove
Hamilton’s theorem: Every rotation can be expressed as a product
of two elementary reflections.

(3.15) Find the polar decomposition of the skewsymmetric transformation

$$f\mathbf{x} = \mathbf{x}\cdot A.$$

(3.16) The linear transformation
$$f\mathbf{x} = \mathbf{x} + 2\alpha\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2\cdot\mathbf{x}$$
is called a shear. Draw a diagram (similar to Figure 2.1) showing the
effect of $f$ on a rectangle in the $\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2$-plane. Find the eigenvectors,
eigenvalues, principal vectors and principal values of $f$ in this plane.
Determine also the angle of rotation in the polar decomposition of $f$.


5-4. Transformation Groups 


So far in this chapter we have concentrated our attention on properties of 
individual linear transformations. However, in physical applications trans- 
formations often arise in families. For example, the change of a physical system 
from one state to another may be described by a transformation, so the 
set of all changes in physical state is a family of transformations. If the 
changes are reversible, then this family has the general structure of a math- 
ematical group. Transformation groups are so common and significant in 
physics that they deserve to be studied systematically in their own right. 

As there is a great variety of different groups, it will be conceptually 
efficient to begin with the abstract definition of a mathematical group de- 
scribing the common properties of all groups. Then we shall examine the 
structure of specific groups of particular importance in physics. 
An abstract group is a set of elements $\mathcal{R}, \mathcal{S}, \ldots$ interrelated by a binary
function called the group product with the following properties:

(1) Closure: To every ordered pair of group elements $(\mathcal{R}, \mathcal{S})$ the group
product associates a unique group element denoted by $\mathcal{R}\mathcal{S}$.

(2) Associativity: For any three group elements $(\mathcal{R}\mathcal{S})\mathcal{T} = \mathcal{R}(\mathcal{S}\mathcal{T})$.

(3) Identity: There is a unique identity element $\mathcal{E}$ in the group with the
property $\mathcal{R}\mathcal{E} = \mathcal{R}$ for every element $\mathcal{R}$.

(4) Inverse: To every element $\mathcal{R}$ there corresponds a unique element $\mathcal{R}^{-1}$
such that

$$\mathcal{R}\mathcal{R}^{-1} = \mathcal{E}.$$


The Rotation Group $\mathcal{O}^+(3)$


From the properties of rotations discussed in Section 5-3, one can easily show
that the set of all rotations on Euclidean 3-space $\mathcal{E}_3$ forms a group for which
the group product is the composition of two rotations. This group is called the
rotation group on $\mathcal{E}_3$ and denoted by $\mathcal{O}^+(3)$. The rotation group deserves
detailed study, first, because it is the most common and generally useful
group in physics, and, second, because it exhibits most of the interesting
properties of groups in general. After one has become familiar with specific
properties of the rotation group, other groups can be efficiently analyzed by
comparing them with the rotation group.

As we have noted before, by selecting a standard basis $\{\boldsymbol{\sigma}_k\}$ we can
associate with each rotation $\mathcal{R}$ a unique matrix with matrix elements


$$r_{jk} = \boldsymbol{\sigma}_j\cdot(\mathcal{R}\boldsymbol{\sigma}_k). \tag{4.1}$$


These matrices form a group for which the matrix product is the group
product. This group is called a matrix representation of the rotation group.
These two groups are isomorphic. Two groups are said to be isomorphic if
their elements and group products are in one-to-one correspondence. The
isomorphism between the rotation group and its matrix representation
determined by Equation (4.1) is shown in Table 4.1.

Besides the matrix representation there is another important representation
of the rotation group. As shown in Section 5-3, the equation


$$\mathcal{R}\boldsymbol{\sigma}_k = R^\dagger\boldsymbol{\sigma}_kR = \sum_j \boldsymbol{\sigma}_jr_{jk} \tag{4.2}$$


determines a correspondence between each rotation $\mathcal{R}$ on $\mathcal{E}_3$ with matrix $[r_{jk}]$
and a pair of unitary spinors $\pm R$. The unitary spinors form a group which we
dub the dirotation group and denote by $2\mathcal{O}^+(3)$, though this nomenclature is
not standard. For this group the geometric product is the group product. The
dirotation group is commonly referred to as the spin-$\tfrac{1}{2}$ representation of the
rotation group. The two-to-one correspondence of elements in $2\mathcal{O}^+(3)$ with
elements in $\mathcal{O}^+(3)$ is called a homomorphism. Table 4.1 shows the
correspondence between elements and operations of these groups. In general, a
homomorphism is a many-to-one correspondence between groups, and the
special case of a one-to-one correspondence is called an isomorphism.

TABLE 4.1. Homomorphism of the rotation group $\mathcal{O}^+(3)$ with its matrix
representation and the group of unitary spinors $2\mathcal{O}^+(3)$.

               Dirotation group     2 to 1   Rotation group                                                      1 to 1   Matrix representation
Elements       $\pm R$              <-->     $\mathcal{R}$                                                       <-->     $[r_{jk}]$
Closure        $SR = T$             <-->     $\mathcal{R}\mathcal{S} = \mathcal{T}$                              <-->     $\sum_j r_{ij}s_{jk} = t_{ik}$
Associativity  $S(RQ) = (SR)Q$      <-->     $(\mathcal{Q}\mathcal{R})\mathcal{S} = \mathcal{Q}(\mathcal{R}\mathcal{S})$   <-->     $\sum_k\big(\sum_j q_{ij}r_{jk}\big)s_{kl} = \sum_j q_{ij}\big(\sum_k r_{jk}s_{kl}\big)$
Inverse        $R^\dagger R = 1$    <-->     $\mathcal{R}^{-1}\mathcal{R} = 1$                                   <-->     $\sum_j (r^{-1})_{ij}r_{jk} = \delta_{ik}$

Homomorphic groups can be regarded as different ways of representing the
same system of mathematical relations. Mathematical groups derive physical
significance from correspondences with actual (or imagined) groups of
operations on physical systems. The displacement of a solid body with one point
fixed is a physical rotation, and the set of all such displacements is the physical
rotation group. This group can be represented mathematically by any one of
the three groups in Table 4.1, a group of linear transformations (the
mathematical rotation group $\mathcal{O}^+(3)$), a group of matrices, or the group of unitary
spinors $2\mathcal{O}^+(3)$. From our experience in Section 5-3, we know that for
computational purposes the spinor group is the most convenient representation
of the physical rotation group, so we shall make great use of it.

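Table 4.1 gives three different mathematical representations of the group
product for the rotation group. There are many others. For example, equation
(3.28b) gives a representation of the group product in the form


$$\phi(\boldsymbol{\beta}_1, \boldsymbol{\beta}_2) = (1 - \boldsymbol{\beta}_2^2)^{1/2}\boldsymbol{\beta}_1 + (1 - \boldsymbol{\beta}_1^2)^{1/2}\boldsymbol{\beta}_2 - \boldsymbol{\beta}_1\times\boldsymbol{\beta}_2, \tag{4.3}$$


where $\boldsymbol{\beta}_1$ and $\boldsymbol{\beta}_2$ are vectors with magnitude less than one which determine the
axis and angle of rotation. The group product has been written in the
functional form $\phi(\boldsymbol{\beta}_1, \boldsymbol{\beta}_2)$ so as not to confuse it with the geometric product
$\boldsymbol{\beta}_1\boldsymbol{\beta}_2$ of vectors. With this notation the properties of the group product take
the form:


Closure:

$$\phi(\boldsymbol{\beta}_1, \boldsymbol{\beta}_2) = \boldsymbol{\beta}_3. \tag{4.4a}$$
Associativity:

$$\phi(\phi(\boldsymbol{\beta}_1, \boldsymbol{\beta}_2), \boldsymbol{\beta}_3) = \phi(\boldsymbol{\beta}_1, \phi(\boldsymbol{\beta}_2, \boldsymbol{\beta}_3)). \tag{4.4b}$$
Identity:

$$\phi(0, \boldsymbol{\beta}) = \boldsymbol{\beta}. \tag{4.4c}$$
Inverse:

$$\phi(-\boldsymbol{\beta}, \boldsymbol{\beta}) = 0. \tag{4.4d}$$


The set $\{\boldsymbol{\beta}\}$ of all vectors in the unit ball $|\boldsymbol{\beta}| \le 1$ is a group under the product
(4.3). According to (4.4c), the identity element in this group is the zero
vector, and, according to (4.4d), the negative of a vector in the group is its
“group inverse”. The reader can verify by substitution that the product
$\phi(\boldsymbol{\beta}_1, \boldsymbol{\beta}_2)$ defined by (4.3) has the group properties (4.4a, b, c, d). However, from
the derivation of (4.3) in Section 5-3 we know without calculation that the
group properties must be satisfied, because the equation


$$R = (1 - \boldsymbol{\beta}^2)^{1/2} + i\boldsymbol{\beta} \tag{4.5}$$


determines homomorphisms of $2\mathcal{O}^+(3)$ and $\mathcal{O}^+(3)$ with the group of vectors.
Each vector $\boldsymbol{\beta} = \sin\tfrac{1}{2}\mathbf{a}$ in the unit ball determines a clockwise rotation
through an angle $|\mathbf{a}|$ about an axis with direction $\hat{\boldsymbol{\beta}} = \hat{\mathbf{a}}$. Note that $\boldsymbol{\beta}$ and $-\boldsymbol{\beta}$
determine equivalent rotations when $|\boldsymbol{\beta}| = 1$.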
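The substitution check of (4.4b, c, d) can be carried out numerically. The sketch below implements the product (4.3) on 3-tuples and tests it on three small Euler vectors. The test vectors are an arbitrary choice, and they are kept small on purpose: the check assumes that every intermediate product has a positive Euler scalar, so that the positive square root in (4.5) recovers the correct spinor. The names `phi`, `cross` and the test variables are assumptions made for illustration.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def phi(b1, b2):
    """Group product (4.3) on Euler vectors b1, b2 inside the unit ball."""
    a1 = math.sqrt(max(0.0, 1.0 - sum(c * c for c in b1)))   # Euler scalar of b1
    a2 = math.sqrt(max(0.0, 1.0 - sum(c * c for c in b2)))   # Euler scalar of b2
    c = cross(b1, b2)
    return tuple(a2 * b1[k] + a1 * b2[k] - c[k] for k in range(3))

b1, b2, b3 = (0.1, 0.2, 0.0), (0.0, 0.3, -0.1), (0.2, 0.0, 0.1)

left = phi(phi(b1, b2), b3)                        # left side of (4.4b)
right = phi(b1, phi(b2, b3))                       # right side of (4.4b)
identity_check = phi((0.0, 0.0, 0.0), b1)          # (4.4c): should return b1
inverse_check = phi(tuple(-c for c in b1), b1)     # (4.4d): should return 0
```

With these vectors the two sides of (4.4b) agree to rounding error. For vectors that describe large rotations the scalar part of an intermediate product can become negative, and a numerical check would then have to carry the sign of the Euler scalar along with the Euler vector.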

Properties of the rotation group can be established by establishing
corresponding properties for any of the groups homomorphic to it. As a rule it will
be most convenient to work with the spinor group and its various
parametrizations such as (4.5) or the parametrization by angle


$$R = e^{(1/2)i\mathbf{a}}. \tag{4.6}$$


According to (4.6), every spinor $R$ is a continuous function of the “angle
vector” $\mathbf{a}$. Since $\mathbf{a}$ can be varied continuously to the value 0, every spinor is
continuously connected to the identity element
$$e^{(1/2)i0} = 1.$$

Because of this property, $2\mathcal{O}^+(3)$ is said to be a continuous group. It follows
that the homomorphic rotation group $\mathcal{O}^+(3)$ is also a continuous group. The
continuity property makes it possible to differentiate the elements of a
continuous group. In Section 5-6 we shall see how to differentiate a rotation
by reducing it to the derivative of the corresponding spinor.

If we keep the direction $\hat{\mathbf{a}} = \mathbf{n}$ fixed and allow the magnitude $a = |\mathbf{a}|$ of the
angle vector in (4.6) to range over the values $0 \le a \le 2\pi$, then we get a group
of spinors with the parametric form


$$R = e^{(1/2)i\mathbf{n}a}, \tag{4.7}$$


showing that they have a common axis $\mathbf{n}$ in $\mathcal{E}_3$. This is the spinor group
$2\mathcal{O}^+(2)$, the dirotation group in the Euclidean plane $\mathcal{E}_2$. According to (4.7) all
group elements are determined by the values of a single scalar parameter $a$, so
$2\mathcal{O}^+(2)$ and $\mathcal{O}^+(2)$ are said to be 1-parameter groups. $2\mathcal{O}^+(3)$ and $\mathcal{O}^+(3)$ are
3-parameter groups because every element can be specified by the values of
three scalar parameters, for example, the three components of the vector $\mathbf{a}$ in
(4.6) relative to a standard basis, or the values of the three Euler angles in
Equation (3.42).

$2\mathcal{O}^+(2)$ is a 1-parameter subgroup of $2\mathcal{O}^+(3)$ while $\mathcal{O}^+(2)$ is a subgroup of
$\mathcal{O}^+(3)$. A subgroup of a group is a subset of group elements which is closed
under the group product, so it is itself a group. Obviously, $\mathcal{O}^+(3)$ contains
infinitely many 1-parameter subgroups of the type $\mathcal{O}^+(2)$, one such subgroup
for each distinct axis of rotation.


The Orthogonal Group $\mathcal{O}(3)$


In Section 5-3 we saw that there are two kinds of orthogonal transformations,
proper and improper. We have seen that the proper transformations (rotations)
form a group $\mathcal{O}^{+}(3)$. The improper transformations do not form a
group, because the product of two improper transformations is a proper one.
However, this shows that the set of orthogonal transformations on $\mathcal{E}_3$ is closed
under composition, so it is a group. This group is called the orthogonal group
$\mathcal{O}(3)$. The rotation group $\mathcal{O}^{+}(3)$ is obviously a subgroup of $\mathcal{O}(3)$.
According to Section 5-3, every rotation has the canonical form

$$\mathcal{R}x = R^{\dagger}xR \tag{4.8}$$

where R is a spinor, that is, an even multivector. On the other hand, every
improper orthogonal transformation has the form

$$\mathcal{R}x = -R^{\dagger}xR \tag{4.9}$$

where R is an odd multivector. In both cases

$$R^{\dagger}R = 1. \tag{4.10}$$

Equations (4.8) and (4.9) can be combined into a single equation

$$\mathcal{R}x = R^{*}xR, \tag{4.11}$$

where

$$R^{*} = R^{\dagger} \quad \text{if } R \text{ is even}, \tag{4.12a}$$

$$R^{*} = -R^{\dagger} \quad \text{if } R \text{ is odd}. \tag{4.12b}$$

Thus, (4.11) subject to (4.10) is the canonical form for any element of the
orthogonal group.

In order to refer to the multivector R in (4.11) as a spinor, we must
enlarge our concept of spinor. We can then distinguish two distinct kinds of
spinor, a proper (or even) spinor satisfying (4.12a) or an improper (or odd)
spinor satisfying (4.12b). These spinors form the diorthogonal group $2\mathcal{O}(3)$
which, by (4.11), is two-to-one homomorphic to the orthogonal group $\mathcal{O}(3)$.
Obviously, the even spinors form the subgroup $2\mathcal{O}^{+}(3)$ homomorphic to the
rotation group $\mathcal{O}^{+}(3)$. Notice also that odd spinors cannot be continuously
connected to the identity element 1, because 1 is even; hence, $2\mathcal{O}(3)$ is not a
continuous group. It does, however, contain two connected subsets, namely,
the even and the odd spinors. Likewise, $\mathcal{O}(3)$ is not a continuous group,
though it consists of two continuous subsets, the proper and the improper
orthogonal transformations. Only one of these two subsets is a subgroup.

The complete orthogonal group $\mathcal{O}(3)$ does not play so important a role in
dynamics as the subgroup of rotations $\mathcal{O}^{+}(3)$, because any physical transformation
of a rigid body must be continuously connected to the identity transformation.
However, improper orthogonal transformations are needed for a
full description of the symmetries of a physical system, as we shall see in
Section 5-5.


The Translation Groups 


A translation $\mathcal{T}_a$ on $\mathcal{E}_3$ is a transformation of each point x in $\mathcal{E}_3$ to another
point

$$\mathcal{T}_a x = x + a. \tag{4.13}$$


This equation is mathematically defined for all points in $\mathcal{E}_3$, but in physical
applications we shall usually be concerned only with applying it to points 
which designate positions of physical particles. Thus, we can regard (4.13) as 
describing a shift a in the position of each particle in 
a physical object, as shown in Figure 4.1. We have 
already made similar interpretations of equations 
describing rotations without saying so explicitly. It 
should be clear from the context when such an 
interpretation is made in the future. 

Fig. 4.1. Translation by a of a physical object.

Although translations are point transformations
and they transform straight lines into straight lines,
in contrast to rotations, they are not linear transformations.
However, the translations do form a
group, so it is convenient to use the operator notation
we have adopted for groups as well as linear transformations, dropping
parentheses to write $\mathcal{T}_a x$ in (4.13) instead of $\mathcal{T}_a(x)$. The main reason for this
convention is the simplicity it gives to the associative rule

$$(\mathcal{T}_a\mathcal{T}_b)\mathcal{T}_c = \mathcal{T}_a(\mathcal{T}_b\mathcal{T}_c).$$

This rule and the other group properties follow directly from the definition of
a translation by (4.13). In addition, from (4.13) it follows that

$$\mathcal{T}_a\mathcal{T}_b = \mathcal{T}_{a+b} = \mathcal{T}_b\mathcal{T}_a. \tag{4.14}$$


Thus, all translations commute with one another. For this reason, the translation
group is said to be a commutative group. The commutativity of translations
is clearly a consequence of the commutativity of vector addition.
Indeed, it is readily verified that the function $\varphi(a, b) = a + b$ has the
properties (4.4a, b, c, d) of a group product. Hence, the vectors of $\mathcal{E}_3$ form a
group under addition which is isomorphic to the translation group on $\mathcal{E}_3$.


The Euclidean Group 


An isometry of Euclidean space $\mathcal{E}_3$ is a point transformation of $\mathcal{E}_3$ onto $\mathcal{E}_3$
which leaves the distance between every pair of points invariant. Thus, if f is
an isometry taking each point x into a point

$$x' = f(x), \tag{4.15}$$

then for every pair of points x and y we have

$$(x' - y')^{2} = (x - y)^{2}. \tag{4.16}$$


The condition (4.16) tells us that there is a transformation $\mathcal{R}$ of each vector
x - y to a vector

$$x' - y' = f(x) - f(y) = \mathcal{R}(x - y) \tag{4.17}$$

with the same length as x - y. What kind of a function is $\mathcal{R}$?
Since (4.17) must apply to every pair of vectors, we have

$$y' - z' = \mathcal{R}(y - z),$$

which, when added to (4.17), gives

$$x' - z' = \mathcal{R}(x - y) + \mathcal{R}(y - z) = \mathcal{R}(x - z).$$

Setting y = 0 in this expression, we find that $\mathcal{R}$ has the distributive property

$$\mathcal{R}(x - z) = \mathcal{R}(x) + \mathcal{R}(-z). \tag{4.18}$$

If we can prove also that

$$\mathcal{R}(ax) = a\mathcal{R}(x) \tag{4.19}$$

for any scalar a, then we will know that $\mathcal{R}$ is necessarily a linear transformation.
Moreover, $\mathcal{R}$ must be an orthogonal transformation, because it leaves
the length of vectors unchanged.

Equation (4.19) can be proved in the following way. Setting -z = x in
(4.18), we get $\mathcal{R}(2x) = 2\mathcal{R}(x)$. Then setting -z = 2x in (4.18), we get
$\mathcal{R}(3x) = 3\mathcal{R}(x)$. Continuing in this way we establish

$$\mathcal{R}(mx) = m\mathcal{R}(x)$$

where m is any integer. If x = ny where n is a nonzero integer, then

$$\mathcal{R}\left(\frac{m}{n}x\right) = \mathcal{R}(my) = m\mathcal{R}(y) = \frac{m}{n}\mathcal{R}(x),$$

which proves that (4.19) holds when a is a rational number. Since any real
number can be approximated to arbitrary accuracy by a rational number, it
follows from the continuity of $\mathcal{R}$ that (4.19) must hold for any real number.
Since $[\mathcal{R}(x)]^{2} = x^{2}$ is a continuous function of x, the function $\mathcal{R}(x)$ must be
continuous. This completes the proof that $\mathcal{R}$ is a linear transformation.
Now, setting y = 0 in (4.17) and writing f(0) = a, we have

$$f(x) = \mathcal{R}x + a = \mathcal{T}_a\mathcal{R}x. \tag{4.20}$$


Thus, we have proved that every isometry of Euclidean space is the product
of an orthogonal transformation and a translation. Using (4.11), we can write
(4.20) in the form

$$\{R \mid a\}x = \mathcal{T}_a\mathcal{R}x = R^{*}xR + a. \tag{4.21}$$

The new notation {R | a} has been introduced to indicate that each isometry
is uniquely determined by a spinor R and a vector a. In this notation

$$\mathcal{R} = \{R \mid 0\} \tag{4.22}$$

denotes an orthogonal transformation, and

$$\mathcal{T}_a = \{1 \mid a\} \tag{4.23}$$

denotes a translation. Note that

$$\{R \mid a\} = \{-R \mid a\}, \tag{4.24}$$

because of the 2-to-1 homomorphism between spinors and orthogonal transformations.
The right side of (4.21) reduces an isometry to multiplication and
addition in geometric algebra.
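
The decomposition (4.20) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: vectors are plain 3-tuples, the spinor product $R^{\dagger}xR$ is replaced by the equivalent Rodrigues rotation formula, and the names `rotate`, `sample_isometry` and `decompose` are hypothetical helpers introduced for this illustration. It recovers the translation vector a = f(0) and the orthogonal part from a given isometry f, then checks additivity and preservation of length.

```python
import math

def rotate(x, n, phi):
    """Right-handed rotation of x about the unit vector n by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    n_cross_x = (n[1]*x[2] - n[2]*x[1],
                 n[2]*x[0] - n[0]*x[2],
                 n[0]*x[1] - n[1]*x[0])
    n_dot_x = n[0]*x[0] + n[1]*x[1] + n[2]*x[2]
    return tuple(x[i]*c + n_cross_x[i]*s + n[i]*n_dot_x*(1.0 - c) for i in range(3))

def sample_isometry(x):
    """Hypothetical isometry f: a rotation about the z axis followed by a shift."""
    y = rotate(x, (0.0, 0.0, 1.0), 0.7)
    return (y[0] + 1.0, y[1] - 2.0, y[2] + 0.5)

def decompose(f):
    """Return (linear_part, a) with a = f(0) and linear_part(x) = f(x) - a."""
    a = f((0.0, 0.0, 0.0))
    def linear_part(x):
        fx = f(x)
        return tuple(fx[i] - a[i] for i in range(3))
    return linear_part, a

linear_part, a = decompose(sample_isometry)
x, y = (1.0, 2.0, 3.0), (-0.5, 0.25, 4.0)
x_plus_y = tuple(x[i] + y[i] for i in range(3))
lx, ly, lxy = linear_part(x), linear_part(y), linear_part(x_plus_y)

# Additivity as in (4.18) and preservation of length, both up to rounding error.
additivity_error = max(abs(lxy[i] - lx[i] - ly[i]) for i in range(3))
length_error = abs(math.sqrt(sum(c*c for c in lx)) - math.sqrt(sum(c*c for c in x)))
```

Both error measures should vanish to within rounding, which is all the sketch is meant to show; it is not a substitute for the proof given above.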

The isometries of Euclidean space form a group. The student can verify the
following group properties. The group product is given by

$$\{S \mid b\}\{R \mid a\} = \{RS \mid S^{*}aS + b\}. \tag{4.25}$$

The identity element is

$$\{1 \mid 0\} = \{-1 \mid 0\}. \tag{4.26}$$

The inverse of an isometry is given by

$$\{R \mid a\}^{-1} = \{R^{\dagger} \mid -RaR^{*}\}. \tag{4.27}$$

The most significant result here is that computations of composite isometries
can be carried out explicitly with (4.25), without referring to (4.21).
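
The group properties can be checked in a matrix translation of (4.25) and (4.27). This is a sketch under an explicit assumption: each orthogonal transformation is represented by its 3x3 matrix M instead of a spinor, so that the pair (M, a) stands for {R | a}, the product becomes (SM, Sa + b), and the inverse becomes (Mᵀ, -Mᵀa). The function names below are hypothetical.

```python
import math

def matvec(M, x):
    return [sum(M[i][j]*x[j] for j in range(3)) for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

def apply(g, x):
    """Apply the isometry g = (M, a) to the point x: x -> Mx + a."""
    M, a = g
    y = matvec(M, x)
    return [y[i] + a[i] for i in range(3)]

def compose(g2, g1):
    """Matrix form of (4.25): first g1 = (R, a), then g2 = (S, b)."""
    (S, b), (R, a) = g2, g1
    Sa = matvec(S, a)
    return (matmul(S, R), [Sa[i] + b[i] for i in range(3)])

def inverse(g):
    """Matrix form of (4.27); valid because M is orthogonal."""
    M, a = g
    Mt = transpose(M)
    return (Mt, [-c for c in matvec(Mt, a)])

def rot_z(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

reflect_x = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # improper

g1 = (rot_z(0.4), [1.0, 0.0, 2.0])
g2 = (reflect_x, [0.0, -3.0, 1.0])
x = [0.3, -1.2, 2.5]

direct = apply(g2, apply(g1, x))
via_product = apply(compose(g2, g1), x)
round_trip = apply(inverse(g1), apply(g1, x))

product_error = max(abs(direct[i] - via_product[i]) for i in range(3))
inverse_error = max(abs(round_trip[i] - x[i]) for i in range(3))
```

The point of the check is the one made in the text: once the product rule is known, composite isometries can be computed from the pairs alone, without applying them to points.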

Isometries describing displacements of a rigid body are especially import- 
ant. A rigid body is a system of particles with fixed distances from one 
another, so every displacement of a rigid body must be an isometry. But a 
finite rigid body displacement must unfold continuously, so it must be con- 
tinuously connected to the identity. From our discussion of the orthogonal 
group, it should be evident that only isometries composed of a rotation and a 
translation have this property. An isometry of this kind is called a rigid 
displacement. Of course a body need not be rigid to undergo a rigid displace- 
ment (see Figure 4.2). 

The set of all rigid displacements is a continuous group
called the Euclidean Group. This group underlies the
geometrical concept of congruence. Two figures are said to be
congruent if one can be superimposed on the other by a rigid
displacement. Thus the Euclidean Group describes all possible
relations of congruency. These relations underlie all
physical measurements. A ruler is a rigid body, and any
measurement of length involves rigid displacements to
compare a ruler with the object being measured.

Fig. 4.2. A rigid displacement is the composite of a rotation and a
translation. (The translation vector a need not be in the plane of rotation.)

Insight into the structure of the Euclidean group can be developed by 
examining specific properties of the rigid displacements. We have proved that 
any rigid displacement can be put in the canonical form 


$$\{R \mid a\}x = \mathcal{T}_a\mathcal{R}x = R^{\dagger}xR + a. \tag{4.28}$$

Note that the rotation here is about an axis through the origin, so the origin is 
a distinguished point in this representation of a rigid displacement. But the 
choice of origin was completely arbitrary in our derivation of (4.28), so 
different choices of origin give different decompositions of a rigid displace- 
ment into a rotation and a translation. Let us see how they are related. 

Let $\mathcal{R}_b$ denote a rotation about a point b. This rotation can be expressed in
the notation of (4.28) by using $\mathcal{T}_{-b} = \mathcal{T}_b^{-1}$ to shift the point b to the origin,
performing the rotation $\mathcal{R}$ about the origin and finally using $\mathcal{T}_b$ to shift the
origin back to the point b. With the help of (4.21) we obtain

$$\mathcal{R}_b = \mathcal{T}_b\mathcal{R}\mathcal{T}_b^{-1} = \{R \mid b - R^{\dagger}bR\}. \tag{4.29}$$

The rotation axis of $\mathcal{R}_b$ is the set of all points invariant under $\mathcal{R}_b$, that is, the
points x satisfying the equation

$$\mathcal{R}_b x = x, \tag{4.30a}$$

or equivalently,

$$R^{\dagger}(x - b)R + b = x. \tag{4.30b}$$

If $\mathcal{R}_b$ is not the identity transformation, Equation (4.30a or b) determines a
straight line passing through the point b. The rotations $\mathcal{R}_b$ and $\mathcal{R} = \mathcal{R}_0$ rotate
points through equal angles about parallel axes passing through the points b
and 0 respectively. The vector b can be decomposed into a component $b_{\parallel}$
parallel to the rotation axis and a component $b_{\perp}$ perpendicular to it, as
described by the equation

$$R^{\dagger}bR = R^{\dagger}(b_{\parallel} + b_{\perp})R = b_{\parallel} + b_{\perp}R^{2}. \tag{4.31}$$

Substitution of (4.31) into (4.29) gives

$$\mathcal{R}_b = \{R \mid b_{\perp}(1 - R^{2})\}. \tag{4.32}$$

If $b_{\perp} = 0$, then $b = b_{\parallel}$ and (4.32) reduces to

$$\mathcal{R}_{b_{\parallel}} = \{R \mid 0\} = \mathcal{R}.$$

Thus, rotations differing only by a shift of origin along the axis of rotation are
equivalent.

The vector $b_{\perp}(1 - R^{2})$ is perpendicular to the axis of rotation determined
by R. We can conclude from (4.32), therefore, that a rigid displacement {R | a}
is a rotation if and only if $aR = R^{\dagger}a$, that is, if and only if the translation
vector $a = a_{\perp}$ is perpendicular to the axis of rotation. Moreover, a fixed point
$b_{\perp}$ of a rotation $\{R \mid a_{\perp}\}$ is determined by the equation

$$a_{\perp} = b_{\perp}(1 - R^{2}).$$

Whence,

$$b_{\perp} = a_{\perp}(1 - R^{2})^{-1}. \tag{4.33}$$

For rotation by an angle $\phi$ about an axis with direction n, the spinor has the
form

$$R = e^{\frac{1}{2}in\phi}.$$

When this is substituted into (4.33) a little calculation gives

$$b_{\perp} = \tfrac{1}{2}a_{\perp}\left(1 + in\cot\tfrac{1}{2}\phi\right) = \tfrac{1}{2}\left(a_{\perp} + n \times a_{\perp}\cot\tfrac{1}{2}\phi\right). \tag{4.34}$$

Since $a_{\perp}$ is perpendicular to the axis of rotation, the transformation $\{R \mid a_{\perp}\}$
leaves every plane perpendicular to the rotation axis invariant, and it consists
of a rotation-translation in each plane. Therefore, we have proved that every
rotation-translation $\{R \mid a_{\perp}\}$ in a plane is equivalent to a rotation centered at
the point $b_{\perp}$ specified by (4.34), as shown in Figure 4.3. Our proof fails in the
case $R^{2} = 1$, for then (4.33) is not defined. In this case we have a pure
translation $\{1 \mid a_{\perp}\}$. Hence, we have
proved that every rigid displacement in
a plane is either a rotation or a translation.
This, of course, is just a byproduct
of our general results on rotations
in three dimensions.
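
The center of rotation (4.34) is easy to test numerically. The sketch below assumes the vector form $b_{\perp} = \tfrac{1}{2}(a_{\perp} + n \times a_{\perp}\cot\tfrac{1}{2}\phi)$, represents the rotation by the Rodrigues formula rather than by a spinor, and uses hypothetical names (`rotation_center`, `displace`) and arbitrarily chosen sample values.

```python
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def rotate(x, n, phi):
    """Right-handed rotation of x about the unit vector n by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    nxx, ndx = cross(n, x), dot(n, x)
    return tuple(x[i]*c + nxx[i]*s + n[i]*ndx*(1.0 - c) for i in range(3))

def rotation_center(a_perp, n, phi):
    """Fixed point of x -> rotate(x) + a_perp; needs a_perp . n = 0 and sin(phi/2) != 0."""
    cot_half = math.cos(phi / 2.0) / math.sin(phi / 2.0)
    n_x_a = cross(n, a_perp)
    return tuple(0.5 * (a_perp[i] + n_x_a[i] * cot_half) for i in range(3))

n = (0.0, 0.0, 1.0)          # rotation axis (sample value)
phi = 1.1                    # rotation angle (sample value)
a_perp = (2.0, -1.0, 0.0)    # translation perpendicular to the axis

def displace(x):
    """The rotation-translation {R | a_perp} applied to x."""
    y = rotate(x, n, phi)
    return tuple(y[i] + a_perp[i] for i in range(3))

b = rotation_center(a_perp, n, phi)
fixed_point_error = max(abs(displace(b)[i] - b[i]) for i in range(3))

# Any other point should be rotated about b through the same angle.
x = (0.4, 3.0, -1.0)
about_b = rotate(tuple(x[i] - b[i] for i in range(3)), n, phi)
rotation_error = max(abs(displace(x)[i] - (about_b[i] + b[i])) for i in range(3))
```

As the text notes, the construction breaks down when the rotation is the identity; in the code this shows up as a division by zero in the cotangent.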

Having determined how rotations
about different points are related, we
are equipped to choose a center of
rotation yielding the simplest possible
decomposition of a rigid displacement
into a rotation and a translation.
Given a general rigid displacement
{R | a}, we decompose a into components
$a_{\parallel}$ and $a_{\perp}$ parallel and perpendicular
to the "axis" of the spinor R, so
that

$$\{R \mid a\} = \{R \mid a_{\parallel} + a_{\perp}\} = \{1 \mid a_{\parallel}\}\{R \mid a_{\perp}\}. \tag{4.35}$$

By comparison with (4.32), the last factor in (4.35) can be identified as a
rotation

$$\mathcal{R}_{b_{\perp}} = \{R \mid a_{\perp}\}, \tag{4.36}$$

where the center of rotation $b_{\perp}$ is given by (4.33) or (4.34). Therefore,
Equation (4.35) can be written

$$\{R \mid a\} = \mathcal{T}_{a_{\parallel}}\mathcal{R}_{b_{\perp}}, \tag{4.37}$$

where $\mathcal{T}_{a_{\parallel}} = \{1 \mid a_{\parallel}\}$ is a translation parallel to the rotation axis of $\mathcal{R}_{b_{\perp}}$. This
result proves the theorem of Chasles (1830): Any rigid displacement can be
expressed as a screw displacement. A screw displacement consists of the
product of a rotation with a translation parallel to or, if you will, along the
axis of rotation (the screw axis). We have done more than prove Chasles'
theorem; we have shown how to find the screw axis for a given rigid
displacement. Our result shows that the screw axis is a unique line in $\mathcal{E}_3$,
except when the rotation is the identity transformation so the screw displacement
reduces to a pure translation.

Fig. 4.3. Equivalence of a rotation-translation in a plane to a pure rotation.

In spite of the uniqueness and simplicity of the representation of a rigid
displacement as a screw displacement, no one has shown that it has any great
practical advantages, so it is seldom used. The representation {R | a} is
generally more useful, because the center of rotation (the origin) can be
specified at will to simplify the problem at hand.
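
For readers who wish to experiment, the screw decomposition (4.37) can be sketched as follows. The assumptions are the same as before: the rotation is given by a unit axis n and an angle phi through the Rodrigues formula instead of a spinor, and the helper name `screw_parameters` is hypothetical. The function returns the translation $a_{\parallel}$ along the screw axis and a point $b_{\perp}$ on that axis.

```python
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def rotate(x, n, phi):
    """Right-handed rotation of x about the unit vector n by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    nxx, ndx = cross(n, x), dot(n, x)
    return tuple(x[i]*c + nxx[i]*s + n[i]*ndx*(1.0 - c) for i in range(3))

def screw_parameters(a, n, phi):
    """Split a into a_par + a_perp and locate the screw axis through b_perp."""
    a_par = tuple(dot(a, n) * n[i] for i in range(3))
    a_perp = tuple(a[i] - a_par[i] for i in range(3))
    cot_half = math.cos(phi / 2.0) / math.sin(phi / 2.0)  # fails for a pure translation
    n_x_a = cross(n, a_perp)
    b_perp = tuple(0.5 * (a_perp[i] + n_x_a[i] * cot_half) for i in range(3))
    return a_par, b_perp

n = (1.0/3.0, 2.0/3.0, 2.0/3.0)   # unit axis (sample value)
phi = 0.9
a = (0.5, -1.0, 2.0)
a_par, b_perp = screw_parameters(a, n, phi)

x = (1.0, 0.0, -2.0)
# Direct form {R | a}: rotate about the origin, then translate by a.
direct = tuple(rotate(x, n, phi)[i] + a[i] for i in range(3))
# Screw form: rotate about the axis through b_perp, then slide along the axis.
rel = rotate(tuple(x[i] - b_perp[i] for i in range(3)), n, phi)
via_screw = tuple(rel[i] + b_perp[i] + a_par[i] for i in range(3))

screw_error = max(abs(direct[i] - via_screw[i]) for i in range(3))
```

The two forms agree to rounding error for any test point x, which is the numerical content of Chasles' theorem for this example.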


5-4. Exercises


(4.1) Prove that the translations satisfy each of the four group properties. 

(4.2) Derive Equation (4.25) and prove that the isometries of Euclidean 
space form a group. 

(4.3) Prove that any rigid displacement with a fixed point is a rotation. 

(4.4) Prove that rotations with parallel axes do not generally commute 
unless the axes coincide. 

(4.5) Derive Equation (4.33) from Equation (4.32). 

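(4.6) A rigid displacement {R | a} can be expressed as the product of a
translation $\mathcal{T}_c$ and a rotation $\mathcal{R}_b$ centered at a specified point b, i.e.

$$\{R \mid a\} = \mathcal{T}_c\mathcal{R}_b.$$

Determine the translation vector c.

(4.7) A subgroup $\{\mathcal{T}\}$ of a group $\{\mathcal{G}\}$ is said to be an invariant subgroup
if $\mathcal{G}^{-1}\mathcal{T}\mathcal{G}$ is in $\{\mathcal{T}\}$ for each $\mathcal{T}$ in $\{\mathcal{T}\}$ and every $\mathcal{G}$ in $\{\mathcal{G}\}$.
Prove that the translations comprise an invariant subgroup of the
Euclidean isometry group.

(4.8) Let $\mathcal{S}$ denote the reflection along a (non-zero) vector a. If $\mathcal{T}_a$ is the
translation by a, then $\mathcal{S}_a = \mathcal{T}_a\mathcal{S}\mathcal{T}_a^{-1}$ is the reflection $\mathcal{S}$ shifted to the
point a. Show that

$$\mathcal{S}_a\mathcal{S} = \mathcal{T}_{2a} = \mathcal{S}\mathcal{S}_{-a}.$$

Thus, a translation by a can be expressed as a product of reflections
in parallel planes separated by a directance $\tfrac{1}{2}a$.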


5-5. Rigid Motions and Frames of Reference


Having determined a general mathematical form for rigid displacements in
Section 5-4, we are prepared to develop a mathematical description of rigid
motions.

Let x = x(t) designate the position of a particle in a rigid body at time t.
According to (4.21), a rigid displacement $\mathcal{D}_t$ of the body from an initial
position at time t = 0 to a position at time t is described by the equation

$$x(t) = \mathcal{D}_t x(0) = \mathcal{R}_t x(0) + a(t) = R^{\dagger}(t)x(0)R(t) + a(t). \tag{5.1}$$

This gives the displacement of each particle in the body as x(0) ranges over
the initial positions of particles in the body. The displacement operator $\mathcal{D}_t$ is
the same for all particles. Regarded as a function of time, $\mathcal{D}_t$ is a 1-parameter
family of displacement operators, one operator for each time. A rigid motion
is a 1-parameter family of rigid displacements, described by a time-dependent
displacement operator $\mathcal{D}_t$. Since the path x = x(t) of a material particle is a
differentiable function of time, the displacement operator $\mathcal{D} = \mathcal{D}_t$ must also be a
differentiable function of time. Our next task is to compute the derivative of a
displacement operator.


Kinematics of Rigid Motions 


It will usually be convenient to suppress the time variable and write (5.1) in
the form

$$x = \mathcal{D}x_0 = \mathcal{R}x_0 + a = R^{\dagger}x_0R + a. \tag{5.2}$$

If x and y designate the positions of two particles in the body, then

$$x - y = \mathcal{R}x_0 - \mathcal{R}y_0 = \mathcal{R}(x_0 - y_0) = R^{\dagger}(x_0 - y_0)R. \tag{5.3}$$

So the time-dependence of the relative position r = x - y of two particles in the
body is

$$r = \mathcal{R}r_0 = R^{\dagger}r_0R. \tag{5.4}$$

Since the rotation operator $\mathcal{R}$ is independent of the choice of
particles, it follows that the motion of a rigid body relative to
any of its particles is a rotation (Euler's Theorem). This is
illustrated in Figure 5.1.

Fig. 5.1. Rigid motion.

Equation (5.4) expresses the rotation
operator $\mathcal{R}$ in terms of a spinor R. This enables us to compute the derivative
of a rotation from the derivative of a spinor. To carry this out we need to
prove that the derivative $\dot{R}$ of a unitary spinor R has the form

$$\dot{R} = \tfrac{1}{2}R\Omega = \tfrac{1}{2}Ri\omega, \tag{5.5}$$

where $\Omega = i\omega$ is a bivector dual to the vector $\omega$. The bivector property of $\Omega$
implies that

$$\Omega^{\dagger} = -\Omega = -i\omega. \tag{5.6}$$


By appealing to the definition of reversion and the derivative, one can easily
prove that

$$\frac{d}{dt}\left(R^{\dagger}\right) = \left(\frac{dR}{dt}\right)^{\dagger} = \dot{R}^{\dagger}, \tag{5.7}$$

which says the operations of reversion and differentiation commute. Consequently,
we can get the derivative of $R^{\dagger}$ from (5.5) by reversion, with the
result

$$\dot{R}^{\dagger} = -\tfrac{1}{2}\Omega R^{\dagger} = -\tfrac{1}{2}i\omega R^{\dagger}. \tag{5.8}$$

The unitarity of R is expressed by the equation

$$R^{\dagger}R = 1. \tag{5.9}$$

Using this property, we can solve (5.5) for

$$\Omega = 2R^{\dagger}\dot{R}. \tag{5.10}$$

This can be taken as the definition of $\Omega$, so all we need to prove is that $\Omega$ is
necessarily a bivector. Differentiation of (5.9) gives

$$\dot{R}^{\dagger}R + R^{\dagger}\dot{R} = 0.$$

Because of (5.7), then,

$$R^{\dagger}\dot{R} = -\dot{R}^{\dagger}R = -\left(R^{\dagger}\dot{R}\right)^{\dagger},$$

which proves (5.6). The property $\Omega^{\dagger} = -\Omega$ implies that $\Omega$ cannot have
scalar and vector parts, leaving the possibility that $\Omega$ has nonvanishing
bivector and pseudoscalar parts. On the other hand, the requirement that the
spinor R must be an even multivector implies that $\dot{R}$ and $R^{\dagger}$ are even, so, by
(5.10), $\Omega$ must be even, and it cannot have a nonvanishing pseudoscalar part.
This completes our proof that $\Omega$ is a bivector.

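Now we can evaluate the derivative $\dot{\mathcal{R}}$. Differentiating (5.4) and using (5.5)
and (5.8), we have

$$\dot{r} = \dot{\mathcal{R}}r_0 = \dot{R}^{\dagger}r_0R + R^{\dagger}r_0\dot{R} = \tfrac{1}{2}R^{\dagger}r_0R\,\Omega - \tfrac{1}{2}\Omega R^{\dagger}r_0R = \tfrac{1}{2}(r\Omega - \Omega r) = \tfrac{1}{2}i(r\omega - \omega r).$$

Hence,

$$\dot{r} = r \cdot \Omega = \omega \times r, \tag{5.11}$$

or, in terms of the rotation operator $\mathcal{R}$ and its derivative $\dot{\mathcal{R}}$,

$$\dot{\mathcal{R}}r_0 = (\mathcal{R}r_0) \cdot \Omega = \omega \times (\mathcal{R}r_0). \tag{5.12}$$

Equation (5.12) shows that $\dot{\mathcal{R}}$ is a linear operator, for it is the composite of $\mathcal{R}$
and the skewsymmetric linear function $r \cdot \Omega = \omega \times r$.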
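
Equation (5.11) can be checked by finite differences in the simplest case. The sketch assumes a constant rotational velocity, so that the rotation at time t is a right-handed rotation about the fixed direction of $\omega$ through the angle $|\omega|t$; it uses the Rodrigues formula in place of the spinor product, and the sample values are arbitrary.

```python
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def rotate(x, n, phi):
    """Right-handed rotation of x about the unit vector n by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    nxx, ndx = cross(n, x), dot(n, x)
    return tuple(x[i]*c + nxx[i]*s + n[i]*ndx*(1.0 - c) for i in range(3))

omega = (0.3, -0.2, 0.5)                 # constant rotational velocity (assumed)
speed = math.sqrt(dot(omega, omega))
axis = tuple(c / speed for c in omega)
r0 = (1.0, 2.0, -1.0)                    # relative position at t = 0

def r(t):
    """Relative position r(t) obtained by rotating r0."""
    return rotate(r0, axis, speed * t)

t, h = 0.8, 1e-6
# Central-difference estimate of dr/dt compared with omega x r.
numeric = tuple((r(t + h)[i] - r(t - h)[i]) / (2.0 * h) for i in range(3))
exact = cross(omega, r(t))
derivative_error = max(abs(numeric[i] - exact[i]) for i in range(3))
```

The agreement is limited only by the finite-difference step, not by the formula. For a time-dependent rotation axis the rotation can no longer be written this simply, which is the reason for the spinor equation (5.5).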

We will refer to the time-dependent vector $\omega = \omega(t)$, or the equivalent
bivector $\Omega = \Omega(t)$, as the rotational velocity of the time-dependent spinor
R = R(t) and the family of rotations it determines. The alternative term
"angular velocity" is common, but it is misleading, because it suggests that $\omega$
is the derivative of the rotation angle, and this is true only if the direction of
the rotation axis is time-independent (Exercise (5.3)). So let us use the term
angular velocity only when the rotation axis is fixed. When the rotational
velocity is a known function of time, the spinor equation (5.5) is a well-defined
differential equation, which can be integrated to find the time-dependence
of the spinor describing the rotational motion. We have, in fact,
encountered this equation already in Section 3-6, and we know that when
$\dot{\omega} = 0$ it has the elementary solution

$$R = e^{\frac{1}{2}i\omega t}.$$


Equation (5.5) is a kinematical equation describing the rotational motion of a 
rigid body. In Chapter 7 we will study the dynamical equation describing the 
influence of forces on a rigid body and use (5.5) to determine the resulting 
rotational motion. 

With Equation (5.12) for the derivative of a rotation at our disposal, we
can ascertain the functional form of the derivative of a general rigid motion
(5.2). Differentiating (5.2), we have

$$\dot{x} = \dot{\mathcal{D}}x_0 = \dot{\mathcal{R}}x_0 + \dot{a} = \omega \times (\mathcal{R}x_0) + \dot{a}.$$

Therefore, in terms of $\omega$ or $\Omega$, we have

$$\dot{x} = \omega \times (x - a) + \dot{a} = (x - a) \cdot \Omega + \dot{a}. \tag{5.13}$$

This is an equation of the form

$$\dot{x} = v(x, t),$$

where v(x, t) is a time-dependent vector field giving the velocity at time t of a
particle in the rigid body located at any point x. The vector a = a(t) designates
the center of rotation for the rigid motion, so it is natural to refer to $\dot{a}$ as
the translational velocity. If we are given the translational velocity $\dot{a}$ and the
rotational velocity $\omega$ as functions of time, then the rigid motion can be
determined by direct integration.


Reference Frames 


We have been using the concept of position without defining it fully. For 
applications it is essential that we make its meaning more explicit. 

The position of a particle is a relation of the particle to some rigid body 
called a reference body or reference frame. The position of a particle with 
respect to a given reference frame at a specified time is represented by a 
position vector x in a reference system attached to (associated with) the 
frame. The set of all possible position vectors {x} is called the position space 
of the reference system or reference frame. 

Often the reference body presumed in a physical application is not mentioned
explicitly. For example, in Figure 5.1 illustrating the displacement of a
rigid body, the paper on which the figure is drawn is the tacitly assumed
reference body. The rigid body illustrated is displaced relative to the paper. A
rigid body cannot be displaced in relation to itself. A displacement is a change
in the relation to a reference body, that is, a change of position.

The position space of a reference frame is a 3-dimensional Euclidean space. 
The points (vectors) of a position space are ‘‘rigidly related” to one another 
like the particles of a rigid body. Furthermore, at any given time the points in 
one reference system can be put in coincidence with the points in any other 
reference system by a rigid displacement. This is a consequence of our 
theorem that the most general transformation leaving the distance between 
points invariant is a rigid displacement. Thus, the points (position vectors) 
{x} in one reference system called the unprimed system are related to the 
points {x’} in another reference system called the primed system by a 
transformation of the form

$$x' = \mathcal{R}x + a. \tag{5.14}$$

If x designates the position of a particle in the unprimed system, then (5.14)
determines the position x' of the particle in the primed system. Thus, the
vectors x and x' in (5.14) designate the same physical place in relation to two
different reference bodies. The vector a locates the origin of the unprimed
system in the primed system.

The reference bodies may be moving relative to one another so in general
the transformation (5.14) is time-dependent. The rotational velocity

$$\omega' = \mathcal{R}\omega = R^{\dagger}\omega R \tag{5.15}$$

of the unprimed frame (or reference body) relative to the primed frame (or
reference body) is defined by

$$\dot{R} = \tfrac{1}{2}Ri\omega' = \tfrac{1}{2}i\omega R. \tag{5.16}$$

From this the derivative of the rotation $\mathcal{R}x = R^{\dagger}xR$ is found to be

$$\dot{\mathcal{R}}x = \mathcal{R}(\omega \times x) = (\mathcal{R}\omega) \times (\mathcal{R}x) = \omega' \times (x' - a). \tag{5.17}$$

Note that equation (5.15) is actually superfluous, because it is a consequence
of (5.16). It has been written down to emphasize that there are two equivalent
ways to represent the angular velocity, either as a vector $\omega$ in the unprimed
system or as a vector $\omega'$ in the primed system. Only one of these vectors is
needed, and the choice is a matter of convenience. We will use $\omega$ rather than
$\omega'$ in order to express the rotational kinematics in the unprimed system.

Now suppose that x = x(t) is the trajectory of a particle in the unprimed
system. Substitution of x = x(t) into (5.14) gives the corresponding trajectory
x' = x'(t) in the primed system. The relation between velocities in the two
frames is then determined by differentiating (5.14); thus,

$$\dot{x}' = \dot{\mathcal{R}}x + \mathcal{R}\dot{x} + \dot{a}.$$

By (5.17), therefore,

$$\dot{x}' = \mathcal{R}(\dot{x} + \omega \times x) + \dot{a}. \tag{5.18}$$

If the bivector $\Omega = i\omega$ is preferred, this takes the form

$$\dot{x}' = \mathcal{R}(\dot{x} + x \cdot \Omega) + \dot{a}. \tag{5.19}$$

If, as "initial conditions" on the rotation operator $\mathcal{R} = \mathcal{R}_t$ and the translation
vector a = a(t), it is required that

$$x' = x \quad \text{at time } t, \tag{5.20}$$

then (5.14) implies that $\mathcal{R} = 1$ at time t and (5.18) becomes

$$\dot{x}' = \dot{x} + \omega \times x + \dot{a}. \tag{5.21}$$

This equation is commonly found in physics books without mention of the fact
that it can hold only at a single time. It is not a differential equation which can
be integrated, nor can it be differentiated to find the relation between
accelerations.

To relate accelerations in the two frames we differentiate (5.18); thus

$$\ddot{x}' = \dot{\mathcal{R}}(\dot{x} + \omega \times x) + \mathcal{R}(\ddot{x} + \dot{\omega} \times x + \omega \times \dot{x}) + \ddot{a}.$$

So, using (5.17), we get the desired result

$$\ddot{x}' = \mathcal{R}\left(\ddot{x} + 2\omega \times \dot{x} + \omega \times (\omega \times x) + \dot{\omega} \times x\right) + \ddot{a}. \tag{5.22}$$

Equivalently, in terms of $\Omega = i\omega$, we have

$$\ddot{x}' = \mathcal{R}\left(\ddot{x} + 2\dot{x} \cdot \Omega + (x \cdot \Omega) \cdot \Omega + x \cdot \dot{\Omega}\right) + \ddot{a}. \tag{5.23}$$

This is the general relation between the accelerations of a particle with
respect to two reference frames with arbitrary relative motion.
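
The acceleration formula (5.22) can be tested against a direct numerical second derivative. The sketch below makes simplifying assumptions that are not part of the general result: the rotational velocity is constant (so the $\dot{\omega} \times x$ term vanishes and the rotation at time t is a rotation about the direction of $\omega$ through the angle $|\omega|t$), and the trajectory x(t) and the translation a(t) are arbitrary sample functions chosen for illustration.

```python
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def rotate(x, n, phi):
    """Right-handed rotation of x about the unit vector n by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    nxx, ndx = cross(n, x), dot(n, x)
    return tuple(x[i]*c + nxx[i]*s + n[i]*ndx*(1.0 - c) for i in range(3))

omega = (0.2, -0.1, 0.4)                 # constant rotational velocity (assumed)
speed = math.sqrt(dot(omega, omega))
axis = tuple(c / speed for c in omega)

# Sample trajectory in the unprimed system with its exact derivatives.
def x_of(t):    return (1.0 + t, t*t, 0.5*t**3)
def x_dot(t):   return (1.0, 2.0*t, 1.5*t*t)
def x_ddot(t):  return (0.0, 2.0, 3.0*t)

# Sample translation vector a(t) and its exact second derivative.
def a_of(t):    return (t*t, 0.0, math.sin(t))
def a_ddot(t):  return (2.0, 0.0, -math.sin(t))

def x_primed(t):
    """Position in the primed system, as in (5.14)."""
    y, a = rotate(x_of(t), axis, speed * t), a_of(t)
    return tuple(y[i] + a[i] for i in range(3))

def acceleration_formula(t):
    """Right side of (5.22) with the omega-dot term dropped."""
    x, v, acc = x_of(t), x_dot(t), x_ddot(t)
    coriolis = cross(omega, v)
    centripetal = cross(omega, cross(omega, x))
    inner = tuple(acc[i] + 2.0*coriolis[i] + centripetal[i] for i in range(3))
    y, add = rotate(inner, axis, speed * t), a_ddot(t)
    return tuple(y[i] + add[i] for i in range(3))

t, h = 0.6, 1e-4
xp_plus, xp_mid, xp_minus = x_primed(t + h), x_primed(t), x_primed(t - h)
numeric = tuple((xp_plus[i] - 2.0*xp_mid[i] + xp_minus[i]) / (h*h) for i in range(3))
formula = acceleration_formula(t)
acceleration_error = max(abs(numeric[i] - formula[i]) for i in range(3))
```

The residual is at the level expected from the second-difference approximation, which supports the signs and factors of the Coriolis and centripetal terms in (5.22) for this special case.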


Inertial Systems


Among the possible reference systems, inertial systems are especially significant.
An inertial system is distinguished by the property that within the system
the equation of motion of a free particle is

$$\ddot{x} = 0.$$

The frame to which an inertial system is "attached" is called an inertial frame.
Thus, with respect to an inertial frame every free particle moves in a straight
line with constant speed. The transformation from one inertial system to
another is determined by (5.22). We simply require that $\ddot{x}' = 0$ when $\ddot{x} = 0$
for any position x or velocity $\dot{x}$ of a free particle. This condition can be met in
(5.22) only if $\omega = 0$ and $\ddot{a} = 0$, that is, only if the two frames are moving with
a constant relative velocity and without a time-dependent relative rotation.
If the primed system is inertial, then Newton's law of motion for a particle
of mass m has the usual form

$$m\ddot{x}' = f', \tag{5.24}$$

where f' is the applied force. Although we have not mentioned it before,
Newton's law has the form (5.24) only in an inertial system. By substituting
(5.22) in (5.24) we get the modified form of Newton's law in an arbitrary
system, namely

$$m\left(\ddot{x} + 2\omega \times \dot{x} + \omega \times (\omega \times x) + \dot{\omega} \times x + \mathcal{R}^{-1}\ddot{a}\right) = f, \tag{5.25}$$

where

$$f' = \mathcal{R}f. \tag{5.26}$$

The additional terms in (5.25) arise from the motion of the reference body for
the unprimed system. The term $\omega \times (\omega \times x)$ is called the centripetal (center-seeking)
acceleration; as Figure 5.2 shows, it is always directed toward the
axis of rotation. The term $2\omega \times \dot{x}$ is called the Coriolis acceleration.
The other terms do not have generally accepted names.

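Of course, the terms in (5.25) can be rearranged to get an equation of
motion formally in the Newtonian form

$$m\ddot{x} = f_{\text{eff}}, \tag{5.27a}$$

where the effective force is given by

$$f_{\text{eff}} = f - m\left(\omega \times (\omega \times x) + 2\omega \times \dot{x} + \dot{\omega} \times x + \mathcal{R}^{-1}\ddot{a}\right). \tag{5.27b}$$

The "fictitious force" $-m\omega \times (\omega \times x)$ is called the centrifugal (center-fleeing)
force, and $-2m\omega \times \dot{x}$ is called the Coriolis force.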
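
As a small worked example, the two named fictitious forces in (5.27b) can be evaluated directly. The sketch assumes a steadily rotating frame with no translational acceleration, so the $\dot{\omega} \times x$ and $\mathcal{R}^{-1}\ddot{a}$ terms are simply left out; the function name `fictitious_forces` and the sample numbers are hypothetical.

```python
def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def fictitious_forces(m, omega, x, v):
    """Return (centrifugal, coriolis) for mass m at position x with velocity v,
    both measured in a frame rotating with constant rotational velocity omega."""
    centrifugal = tuple(-m * c for c in cross(omega, cross(omega, x)))
    coriolis = tuple(-2.0 * m * c for c in cross(omega, v))
    return centrifugal, coriolis

# Frame rotating about the z axis; particle displaced from the axis along x
# and moving along y.
centrifugal, coriolis = fictitious_forces(
    m=1.5, omega=(0.0, 0.0, 2.0), x=(3.0, 0.0, 1.0), v=(0.0, 1.0, 0.0))
```

In this example the centrifugal force points along the positive x direction, away from the rotation axis, and its magnitude depends only on the distance from the axis, not on the height along it.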

Equation (5.27a) for motion in a noninertial system is as well-defined and
solvable as Newton's equation for motion in an inertial system. However, the
"fictitious forces" in (5.27b) can in principle
be distinguished from the "real
force" f by virtue of their particular
functional dependence on m, x and $\dot{x}$. In
practice, $\omega$ can be measured directly by
observing rotation of the frame relative
to the "fixed stars". Ideally, fictitious
forces can be measured by observing
accelerations of free particles with respect
to a noninertial frame, though this
is seldom practical. Considerations of
practicality aside, the point is that a real
force (field) is distinguished from a fictitious
force by the fact that it depends on
the presence of material bodies to produce it, and it cannot be "transformed
away" in a finite region of space by a change of frame. On the basis of this
distinction, then, it can be said that Equation (5.27a) does not have the form
of Newton's law unless $\omega = 0$ and $\ddot{a} = 0$ in (5.27b).

Fig. 5.2. Illustrating centripetal acceleration.

Einstein observed that a uniform gravitational force field f = mg can be
cancelled by transforming to a frame with constant acceleration so that
(5.27b) becomes

$$f_{\text{eff}} = mg - m\ddot{a} = 0. \tag{5.28}$$

This observation played an important heuristic role in the development of
Einstein's theory of gravitation. It does indeed show that a uniform gravitational
force cannot be distinguished from a fictitious force due to constant
acceleration. However, the nature of gravitation is such that there is actually
no such thing as a completely uniform gravitational field, and the deviations
from uniformity are sufficient to uphold the distinction between real and
fictitious forces.

Returning to Equation (5.25), we note that if the unprimed frame is
inertial, then $\omega = 0$ and $\ddot{a} = 0$, so it reduces to

$$m\ddot{x} = f. \tag{5.29}$$

This proves that a transformation between inertial systems is the most general
transformation preserving the form of Newton's law. The general form of such
a transformation can be deduced from (5.18), which, for vanishing rotational
velocity and constant translational velocity $\dot{a} = v$, reduces to

$$\dot{x}' = \mathcal{R}\dot{x} + v, \tag{5.30}$$

where $\mathcal{R}$ is a time-independent rotation. Integrating (5.30), we obtain

$$x' = \mathcal{R}x + a_0 + vt. \tag{5.31}$$

Since this applies to particles at rest as well as in motion, it gives us the
general relation between points in two inertial systems.

The group of transformations that leave the form of Newton's law invariant
is called the Galilean Group. This is the group of transformations relating
inertial systems. Every element of the group can be put in the form (5.31)
which shows that it can be expressed as a composite of a rotation

$$x \to \mathcal{R}x = R^{\dagger}xR, \tag{5.32a}$$

a space translation

$$x \to x + a, \tag{5.32b}$$

and a so-called Galilean transformation,

$$x \to x + vt. \tag{5.32c}$$

Note also that the form of (5.31) is unchanged by a time translation

$$t \to t + a, \tag{5.32d}$$

so this transformation belongs to the group. Clearly the Galilean Group
consists of the Euclidean Group of rigid displacements extended to include
Galilean transformations and time translations. The Galilean Group is a
10-parameter continuous group with 3 parameters to determine the rotations,
3 parameters for space translations, 3 parameters for Galilean transformations,
and one parameter for time translations.
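
The statement that (5.31) preserves the form of Newton's law amounts to saying that accelerations are merely rotated. A minimal numerical sketch of this, assuming a fixed rotation given by an axis and an angle (Rodrigues formula in place of the spinor) and an arbitrary sample trajectory, is the following; all names and values are illustrative.

```python
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def rotate(x, n, phi):
    """Right-handed rotation of x about the unit vector n by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    nxx, ndx = cross(n, x), dot(n, x)
    return tuple(x[i]*c + nxx[i]*s + n[i]*ndx*(1.0 - c) for i in range(3))

# Parameters of one element of the group: a time-independent rotation,
# a constant offset a0, and a constant relative velocity v (sample values).
axis, angle = (0.0, 0.0, 1.0), 0.3
a0 = (1.0, 2.0, 3.0)
v = (0.5, 0.0, -1.0)

def to_primed(x, t):
    """x' = Rx + a0 + v t, as in (5.31)."""
    y = rotate(x, axis, angle)
    return tuple(y[i] + a0[i] + v[i]*t for i in range(3))

# Sample trajectory in the unprimed system and its exact acceleration.
def x_of(t):    return (math.cos(t), t*t, 0.5*t)
def x_ddot(t):  return (-math.cos(t), 2.0, 0.0)

t, h = 0.7, 1e-4
p_plus = to_primed(x_of(t + h), t + h)
p_mid = to_primed(x_of(t), t)
p_minus = to_primed(x_of(t - h), t - h)
primed_accel = tuple((p_plus[i] - 2.0*p_mid[i] + p_minus[i]) / (h*h) for i in range(3))
rotated_accel = rotate(x_ddot(t), axis, angle)
invariance_error = max(abs(primed_accel[i] - rotated_accel[i]) for i in range(3))
```

Neither the offset nor the relative velocity leaves any trace in the acceleration, so a force that transforms as in (5.26) gives the same law of motion in both systems.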

We have interpreted (5.31) as describing the location of a single particle 
relative to two different reference bodies. It can be interpreted alternatively 
as a displacement of one or more particles relative to a single reference body. 
Newton’s law is form invariant in either case. The important thing is that 
(5.31) describes a change in the relation of particles relative to some inertial 
frame. 

The form invariance of Newton’s law under the Euclidean Group means 
that it provides a description of particle motions and interactions that is 
independent of the relative position and orientation of the reference body, 
which implies further that the reference body does not interact with the 
particles described by Newton’s law. Invariance under translations means that 
all places in position space are equivalent, that is, position space is homogen- 
eous. Invariance under rotations means that all directions in physical space 
are equivalent, that is, position space is isotropic. Similarly, invariance under 
time translations means that time is homogeneous. Thus, the laws of physics 
are the same at all times and all places. This crucial property of physical laws 
enables us to compare and integrate experimental results from laboratories all 
over the world without worrying about when and where the experiments were 
done. The astronomer uses it to infer what is happening on stars many 
light-years away, and the geologist uses it to ascertain how the Earth’s surface 
was formed. Scientific laws are valuable precisely because they describe 
features common to all experience. 

Note that for inertial frames related by a Galilean transformation

$$x' = x + vt, \tag{5.33}$$

each particle of the unprimed frame is moving with constant velocity v with
respect to the primed frame. Therefore each particle in the reference body of
an inertial system is a free particle. Thus, the physical requirement of no net
force on particles of the reference body distinguishes inertial frames from
other reference frames. Of course, this is an idealization that can never be
perfectly met in practice.

The derivative of the Galilean transformation (5.33) yields the velocity
addition formula

$$\dot{x}' = \dot{x} + v \tag{5.34}$$

relating the velocities of a particle with respect to two inertial frames moving
with relative velocity v. The important thing about this formula is the fact
that, according to (5.26), the force f' = f on the particles is the same in both
reference systems. This enables us to do such things as analyze the motion of
an object in the atmosphere or a river without considering motion relative to
the Earth until the analysis is complete. Motion relative to the Earth can then
be accounted for trivially with (5.33) or (5.34).


5-5. Exercises 


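(5.1) For a unitary spinor R = R(t) with the parametrizations

$$R = e^{(1/2)i\mathbf{a}} = \alpha + i\boldsymbol{\beta} = \alpha(1 + i\boldsymbol{\gamma}),$$

show that the rotational velocity $i\boldsymbol{\omega} = 2\dot{R}R^{\dagger}$ has the following
parametric expressions:

$$\boldsymbol{\omega} = 2(\alpha\dot{\boldsymbol{\beta}} - \dot{\alpha}\boldsymbol{\beta} + \dot{\boldsymbol{\beta}} \times \boldsymbol{\beta}),$$

where $\alpha^2 + \boldsymbol{\beta}^2 = 1$ and $\alpha\dot{\alpha} = -\boldsymbol{\beta} \cdot \dot{\boldsymbol{\beta}}$;

$$\boldsymbol{\omega} = 2\,\frac{\dot{\boldsymbol{\gamma}} + \dot{\boldsymbol{\gamma}} \times \boldsymbol{\gamma}}{1 + \boldsymbol{\gamma}^2};$$

$$\boldsymbol{\omega} = \dot{a}\mathbf{n} + \dot{\mathbf{n}}\sin a + \dot{\mathbf{n}} \times \mathbf{n}\,(1 - \cos a),$$

where $\mathbf{n}^2 = 1$, $\mathbf{a} = a\mathbf{n}$ and $\dot{\mathbf{n}} \times \mathbf{n} = i\mathbf{n}\dot{\mathbf{n}}$.

Note that $\boldsymbol{\omega} = \dot{\mathbf{a}}$ if and only if $\dot{\mathbf{n}} = 0$.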
(5.2) Four time-dependent unitary spinors R, satisfying R, = +R,Q, are 
related by the equation 


R, = R,R,R;. 

Show that their rotational velocities are related by the equation 
Q,—= R'(R'Q.R, + 2,)R, + Q,. 

(This will be useful in Exercises (5.3) and (5.7)). 

(53) For a unitary spinor R = A(t) with the Eulerian parametrization 

R= elle ef ae el 2a 

show the rotational velocity :@ = 2RRt has the parametric form 
w= go,+ OnX o, + wn, 

where 


n = Rto,R = a, cos 8 - a,e'** sin 6 
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(5.4) 


(5.5) 
(5.6) 


(5.7) 
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oP n X 6. ian a 
n X 6, = —————- = 6, e*® = a4, cos) + a, Sin ©. 
|n Xo, | 
The derivative of a time-dependent linear operator # = X, is de- 
fined by 


Show from this that “is a linear operator. Show that the usual rule 
for differentiating a product holds for the composite of time- 
dependent linear operators % and <5, that is, show that 


<() = RS + RS. 


This rule does not hold for the composite of arbitrary functions. 
Show however, that it also holds for the composite of rigid displace- 
ments. 

Fill in the steps in the derivation of Equation (5.17). 

Derive Equations (5.19) and (5.23) directly using 


R = +R. 


A wheel of radius 6 is rolling upright with constant speed v on a 
circular track of radius a (Figure 5.3). The motion of a point x on 
the wheel can be described by the equation 

x = RURrR, + a)R, = RrR+a=rt+a, 
which can be interpreted as follows: a 
fixed point r, on the wheel is rotated 
about the axis with constant angular 
velocity w, by a spinor R,; then the 
wheel is translated by a, along its axis 
from the center to the edge of the 
circular track where it is rotated with 
constant angular velocity w, by a 
spinor R,. Consequently, 

R = RGR == ef l2eit elie at 


; Fig. 5.3. Wheel rolling on a circu- 
(a) Determine @,, @, and the ro- lar track. 


tational velocity w = -2iR'R of the wheel about its moving 
center. 

(b) Calculate the velocity and acceleration of an arbitrary point x 
on the wheel. 

(c) Evaluate the velocity and acceleration at the top and bottom of 
the wheel.


5-6. Motion in Rotating Systems


A choice of reference frame will be dictated by the problem under consider- 
ation. The frames which we have most occasion to use are distinguished by 
their choice of origin. A reference system (and its associated frame) is said to 
be heliocentric if its origin is at the center of the Sun, geocentric if its origin is 
at the center of the Earth, or topocentric if its origin is fixed on the surface of 
the Earth. None of these frames is inertial, since a topocentric frame rotates 
with the Earth, the Earth revolves about the Sun, and the Sun orbits in a 
galaxy. Let us evaluate the effect of these relative motions on the observed 
motion of objects on the Earth. 

The relative directions of the distant stars observed on the “celestial
sphere” vary so little in “human time intervals” that they can be used as an
absolute standard of rotationless motion. With respect to an inertial frame, 
then, the directions of the distant stars must be fixed in time. 

Let {x’} be the position space of an inertial system in which the Earth is 
initially at rest with its center at the origin. Let {x} be the position space of a 
geocentric system with the Earth as a body of reference. If we neglect, for the 
time being, the Earth’s acceleration due to motion about the Sun, the two
frames are related by


$$\mathbf{x}' = \mathcal{R}\mathbf{x} = R^{\dagger}\mathbf{x}R, \qquad (6.1)$$


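where the operator $\mathcal{R}$ or the spinor R describes the rotation of the Earth
relative to the fixed stars. The consequences of this relation were derived in
Section 5-5, so we need only to summarize the relevant relations here. The
Earth’s rotational velocity $\boldsymbol{\omega}$ (in the rotating Earth system) is defined by the
spinor equation


$$\dot{R} = \tfrac{1}{2}\,i\boldsymbol{\omega}R \qquad (6.2)$$

or the corresponding operator equation

$$\dot{\mathcal{R}}\mathbf{x} = \mathcal{R}(\boldsymbol{\omega} \times \mathbf{x}). \qquad (6.3)$$

The equation of motion

$$m\ddot{\mathbf{x}}' = \mathbf{f}' \qquad (6.4)$$


in the inertial frame corresponds to the equation of motion

$$m\ddot{\mathbf{x}} = \mathbf{f} - m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{x}) - 2m\boldsymbol{\omega} \times \dot{\mathbf{x}} - m\dot{\boldsymbol{\omega}} \times \mathbf{x} \qquad (6.5)$$

in the Earth frame, with

$$\mathbf{f}' = \mathcal{R}\mathbf{f} = R^{\dagger}\mathbf{f}R. \qquad (6.6)$$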


Let us examine, now, the effect of the real and fictitious forces in (6.5) on
a particle near the Earth’s surface. The term $m\dot{\boldsymbol{\omega}} \times \mathbf{x}$ is entirely negligible
compared to the other forces, because variation in the Earth’s rotation period
$T = 2\pi/\omega$ is of the order of milliseconds per year and variation in the
direction of $\boldsymbol{\omega}$ is comparably small. We shall see how to calculate $\dot{\boldsymbol{\omega}}$ later on
when we examine the rotational motion of the Earth itself in more detail.

According to Newton’s law of gravitation, in the approximation of a spherical
Earth, the gravitational force on a particle outside the surface of the Earth
is


$$\mathbf{f} = -\frac{GmM\mathbf{r}}{r^3} = m\mathbf{G}, \qquad (6.7)$$


where M is the mass of the Earth. Observe that, by (6.1),

$$\mathcal{R}\mathbf{f} = -\frac{GmM\,\mathcal{R}\mathbf{r}}{r^3} = -\frac{GmM\mathbf{r}'}{r'^3} = \mathbf{f}',$$

showing that (6.1) is consistent with (6.6).


True and Apparent Weight 


The gravitational force f = mG exerted by the Earth on an object is called the
true weight of the object. The object’s apparent weight W is


$$\mathbf{W} = m\mathbf{g} = m(\mathbf{G} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r})). \qquad (6.8)$$


This is the resultant of the gravitational and centrifugal forces (Figure 6.1), 
which are difficult to separate near the surface of the Earth, because they are 
slowly varying functions of position. To estimate the contribution of the 
centrifugal force, we use 


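$$\mathbf{g} = \mathbf{G} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) = \mathbf{G} + \boldsymbol{\omega} \cdot (\boldsymbol{\omega} \wedge \mathbf{r}) = \mathbf{G} + \omega^2\mathbf{r} - \boldsymbol{\omega} \cdot \mathbf{r}\,\boldsymbol{\omega}. \qquad (6.9)$$

From this we see that $g = |\mathbf{g}|$ has the value

$$g_{\rm pole} = |\mathbf{G}|$$

at the Earth’s pole and the value

$$g_{\rm eq} = |\mathbf{G}| - \omega^2 r$$

at the Equator. Hence

$$g_{\rm pole} - g_{\rm eq} = \omega^2 r_e \approx 3.4\ \mathrm{cm\,s^{-2}}, \qquad (6.10)$$

where we have used

$$\omega = 2\pi\ \text{radians/day} = 7.29 \times 10^{-5}\ \mathrm{s^{-1}} \qquad (6.11)$$

for the angular speed and

$$r_e = 6370\ \mathrm{km} \qquad (6.12)$$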


for the mean radius of the Earth. The measured value of $g_{\rm pole} - g_{\rm eq}$ is
$5.2\ \mathrm{cm\,s^{-2}}$. The discrepancy between this value and the calculated value is
due to the oblateness of the Earth. Indeed, it can be used to estimate the
oblateness of the Earth. The value of g at sea level is about
$980\ \mathrm{cm\,s^{-2}}$, so the relative contribution of the centrifugal force varies by
only half a percent from pole to Equator.
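
As a quick numerical check of this estimate, the following sketch evaluates the centrifugal term $\omega^2 r$ from the values (6.11) and (6.12) quoted in the text. The function name and the choice of CGS units are ours, introduced only for illustration.

```python
# Values quoted in the text: Eq. (6.11) and Eq. (6.12).
OMEGA = 7.29e-5        # angular speed of the Earth, rad/s
R_EARTH_CM = 6370e5    # mean radius of the Earth, converted from km to cm

def centrifugal_difference(omega=OMEGA, radius_cm=R_EARTH_CM):
    """Return omega^2 * r in cm/s^2: the pole-to-Equator difference in g
    for a spherical Earth."""
    return omega ** 2 * radius_cm

calc = centrifugal_difference()
fraction = calc / 980.0   # relative to g at sea level, about 980 cm/s^2

print(f"calculated g_pole - g_eq = {calc:.2f} cm/s^2")
print(f"fraction of g at sea level = {fraction:.2%}")
```

The calculated difference comes out below the measured 5.2 cm s^-2, which is the discrepancy the text attributes to the oblateness of the Earth.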


Coriolis Force 


Fig. 6.1. Relation between true and apparent weights (not to scale).


The transformation from the geocentric system {x} to a topocentric system {r},
as shown in Figure 6.2, is a simple translation


$$\mathbf{x} = \mathbf{r} + \mathbf{a}, \qquad (6.13)$$


where a is a fixed point on the Earth’s surface. Substituting (6.13) into (6.5),
we get, for constant $\boldsymbol{\omega}$, the equation of motion


$$\ddot{\mathbf{r}} = \mathbf{g} - 2\boldsymbol{\omega} \times \dot{\mathbf{r}}, \qquad (6.14)$$


where the centrifugal force has been incorporated into the “gravitational
force” $m\mathbf{g}$ and nongravitational forces have been omitted for the time being.

From (6.14) we can calculate the effect of the Coriolis force on projectile
motion in the approximation where g is constant. Actually, in Section 3-7 we
found the exact solution to (6.14) for constant $\mathbf{g}$ and $\boldsymbol{\omega}$. In the present
case, however, for typical velocities we have $|2\dot{\mathbf{r}} \times \boldsymbol{\omega}| \ll g$ because of the
relatively small value (6.11) for the angular velocity of the Earth.
Consequently, a perturbative solution to (6.14) is more useful than the exact
solution. We could, of course, get the appropriate approximation by expanding
the exact solution, but it is at least as easy to get it directly from (6.14)
in the following way.


Fig. 6.2. A topocentric frame with latitude $\theta$.


Let us write $\mathbf{v} = \dot{\mathbf{r}}$ so (6.14) takes the form

$$\dot{\mathbf{v}} = \mathbf{g} + 2\mathbf{v} \times \boldsymbol{\omega}. \qquad (6.15)$$


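Regarding the last term in (6.15) as a small perturbing force, the equation can
be solved by the method of successive approximations. We write the velocity
as an expansion of successive orders in $\omega$,


$$\mathbf{v} = \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3 + \ldots. \qquad (6.16)$$


The zeroth order term $\mathbf{v}_1$ is required to satisfy the unperturbed equation
$\dot{\mathbf{v}}_1 = \mathbf{g}$, which integrates to


$$\mathbf{v}_1 = \mathbf{g}t + \mathbf{v}_0, \qquad (6.17a)$$


where $\mathbf{v}_0$ is the initial velocity. Inserting v to first order in Equation (6.15) we
get


$$\dot{\mathbf{v}} = \dot{\mathbf{v}}_1 + \dot{\mathbf{v}}_2 = \mathbf{g} + 2(\mathbf{v}_1 + \mathbf{v}_2) \times \boldsymbol{\omega}.$$


Neglecting the second order term $2\mathbf{v}_2 \times \boldsymbol{\omega}$, this equation reduces to an
equation for $\mathbf{v}_2$ when (6.17a) is used:


$$\dot{\mathbf{v}}_2 = 2\mathbf{v}_1 \times \boldsymbol{\omega} = 2(\mathbf{g}t + \mathbf{v}_0) \times \boldsymbol{\omega}.$$

This integrates to

$$\mathbf{v}_2 = (\mathbf{g}t^2 + 2\mathbf{v}_0 t) \times \boldsymbol{\omega}. \qquad (6.17b)$$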


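In a similar way, we can determine the second order correction $\mathbf{v}_3$ and higher
order terms if desired.
Substituting (6.17a, b) into (6.16), we have the velocity to first order in $\omega$:


$$\mathbf{v} = \mathbf{v}_0 + \mathbf{g}t + (2\mathbf{v}_0 t + \mathbf{g}t^2) \times \boldsymbol{\omega} + \ldots. \qquad (6.18)$$

Integrating this, we get a parametric equation for the displacement

$$\mathbf{r} = \tfrac{1}{2}\mathbf{g}t^2 + \mathbf{v}_0 t + \Delta\mathbf{r}, \qquad (6.19)$$

where the deviation $\Delta\mathbf{r}$ from a parabolic trajectory is given to first order by

$$\Delta\mathbf{r} = (\mathbf{v}_0 + \tfrac{1}{3}\mathbf{g}t) \times \boldsymbol{\omega}\,t^2 + \ldots. \qquad (6.20)$$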
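
The quality of the first-order result can be checked numerically. The sketch below integrates the equation of motion $\dot{\mathbf{v}} = \mathbf{g} + 2\mathbf{v} \times \boldsymbol{\omega}$ with a classical Runge-Kutta scheme and compares the position with the parabola plus the first-order deviation $(\mathbf{v}_0 + \tfrac{1}{3}\mathbf{g}t) \times \boldsymbol{\omega}\,t^2$. The axis convention (x East, y North, z Up), the latitude, the launch velocity and all function names are assumptions made for illustration only.

```python
import math

OMEGA = 7.29e-5  # angular speed of the Earth in rad/s, Eq. (6.11)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def first_order_position(t, v0, g, omega):
    """Parabolic trajectory plus the first-order Coriolis deviation."""
    parabola = add(scale(0.5 * t * t, g), scale(t, v0))
    deviation = scale(t * t, cross(add(v0, scale(t / 3.0, g)), omega))
    return add(parabola, deviation), deviation

def integrate_position(t_end, v0, g, omega, steps=4000):
    """Integrate dv/dt = g + 2 v x omega and dr/dt = v with classical RK4."""
    def accel(v):
        return add(g, scale(2.0, cross(v, omega)))
    h = t_end / steps
    r, v = (0.0, 0.0, 0.0), v0
    for _ in range(steps):
        k1v = accel(v)
        v2 = add(v, scale(h / 2, k1v))
        k2v = accel(v2)
        v3 = add(v, scale(h / 2, k2v))
        k3v = accel(v3)
        v4 = add(v, scale(h, k3v))
        k4v = accel(v4)
        dr = add(add(v, scale(2.0, v2)), add(scale(2.0, v3), v4))
        dv = add(add(k1v, scale(2.0, k2v)), add(scale(2.0, k3v), k4v))
        r = add(r, scale(h / 6, dr))
        v = add(v, scale(h / 6, dv))
    return r

# Assumed setup: latitude 45 degrees North, shot fired due North at 45 degrees.
lat = math.radians(45.0)
omega = (0.0, OMEGA * math.cos(lat), OMEGA * math.sin(lat))
g = (0.0, 0.0, -9.8)
v0 = (0.0, 300.0 * math.cos(math.radians(45.0)), 300.0 * math.sin(math.radians(45.0)))

t = 40.0
r_first, deviation = first_order_position(t, v0, g, omega)
r_numeric = integrate_position(t, v0, g, omega)
residual = norm(add(r_numeric, scale(-1.0, r_first)))
print("first-order deviation (m):", deviation)
print("numerical minus first-order position (m):", residual)
```

For this setup the first-order deviation is an eastward shift of roughly ten meters, while the residual left for second and higher orders is a few centimeters.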


To estimate the magnitude of the correction $\Delta\mathbf{r}$, we observe from (6.19)
and (6.20) that

$$\frac{|\Delta\mathbf{r}|}{|\mathbf{r}|} \approx \omega t. \qquad (6.21)$$


For the correction to be as much as one percent, then, we must have
$\omega t \geq 0.01$, and from the value (6.11) for $\omega$ we find that the time of flight t
must be at least two minutes. As the time of flight in a typical projectile
problem is less than two minutes, we need not consider corrections of higher
order than the first. Indeed, before higher order corrections are considered,
the assumption that g is constant should be examined.

As it stands, the expression (6.20) for Coriolis deflection $\Delta\mathbf{r}$ is not in the
most convenient form, because it is not given as a function of target location r.
This can be remedied by using the zeroth order approximation 


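$$\mathbf{r} = \tfrac{1}{2}\mathbf{g}t^2 + \mathbf{v}_0 t \qquad (6.22)$$


to eliminate $\mathbf{v}_0$ in (6.20), with the result


$$\Delta\mathbf{r} = -t\boldsymbol{\omega} \times (\mathbf{r} - \tfrac{1}{6}\mathbf{g}t^2). \qquad (6.23)$$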


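This shows the directional dependence of $\Delta\mathbf{r}$ on r. If needed, the dependence
of t on r can be obtained from (6.22); thus


$$t = \frac{\mathbf{r} \wedge \mathbf{g}}{\mathbf{v}_0 \wedge \mathbf{g}} \quad \text{and} \quad t^2 = \frac{2\,\mathbf{r} \wedge \mathbf{v}_0}{\mathbf{g} \wedge \mathbf{v}_0}.$$

Notice that

$$\tfrac{1}{6}\mathbf{g}t^2 = \frac{1}{3}\left(\frac{\mathbf{r} \wedge \mathbf{v}_0}{\mathbf{g} \wedge \mathbf{v}_0}\right)\mathbf{g}, \qquad (6.24)$$

showing that the two terms in (6.23) are of the same order of magnitude.
From (6.23) we find that the change in range due to the Coriolis force is


$$\hat{\mathbf{r}} \cdot \Delta\mathbf{r} = \tfrac{1}{6}t^3\,\hat{\mathbf{r}} \cdot (\boldsymbol{\omega} \times \mathbf{g}). \qquad (6.25)$$


Similarly, the vertical deflection is found to be


$$\hat{\mathbf{g}} \cdot \Delta\mathbf{r} = t\,\mathbf{r} \cdot (\boldsymbol{\omega} \times \hat{\mathbf{g}}). \qquad (6.26)$$

The vector $\boldsymbol{\omega} \times \mathbf{g}$ is directed West (Figure 6.3), except at the poles, so
both (6.25) and (6.26) vanish for trajectories to the North or South. They
have maximum values for trajectories to the West. This is due to rotation of
the Earth in the opposite direction while the projectile is in flight, as can
be seen by examining the trajectory in an inertial frame.


Fig. 6.3. Coriolis effects are largest for westward motion in both Northern
and Southern hemispheres.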

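In most circumstances, resistive forces will have a greater effect on the
range and vertical deflection than the Coriolis force. The lateral Coriolis
deflection is more significant, because it will not be masked by resistive
forces. So let us examine it. For a target on the horizontal plane
$\mathbf{g} \cdot \mathbf{r} = 0$ and $\hat{\mathbf{g}} \times \hat{\mathbf{r}}$ is a unit rightward vector. From (6.23), therefore,
the rightward deflection $\Delta R$ is given by


$$\Delta R \equiv (\hat{\mathbf{g}} \times \hat{\mathbf{r}}) \cdot \Delta\mathbf{r} = -t(\hat{\mathbf{g}} \times \hat{\mathbf{r}}) \cdot [\boldsymbol{\omega} \times (\mathbf{r} - \tfrac{1}{6}\mathbf{g}t^2)]$$

$$= t(\hat{\mathbf{g}} \wedge \hat{\mathbf{r}}) \cdot [\boldsymbol{\omega} \wedge (\mathbf{r} - \tfrac{1}{6}\mathbf{g}t^2)] = -t\,[\,r\,\boldsymbol{\omega} \cdot \hat{\mathbf{g}} + \tfrac{1}{6}gt^2\,\boldsymbol{\omega} \cdot \hat{\mathbf{r}}\,].$$


Using (6.24) we have


$$\Delta R = (\hat{\mathbf{g}} \times \hat{\mathbf{r}}) \cdot \Delta\mathbf{r} = -rt\,\boldsymbol{\omega} \cdot \left[\hat{\mathbf{g}} + \frac{1}{3}\left(\frac{\hat{\mathbf{r}} \wedge \mathbf{v}_0}{\hat{\mathbf{g}} \wedge \mathbf{v}_0}\right)\hat{\mathbf{r}}\right]. \qquad (6.27)$$

For nearly horizontal trajectories $\mathbf{r} \wedge \mathbf{v}_0 \approx 0$, so $\Delta R = -rt\,\boldsymbol{\omega} \cdot \hat{\mathbf{g}}$, which is
positive in the Northern hemisphere and negative in the Southern hemisphere
(Figure 6.3). As a general rule, therefore, the Coriolis force tends to deflect
particles to the right in the Northern hemisphere and to the left in the Southern
hemisphere. This rule is violated, however, by highly arched trajectories, and
from (6.27) one can determine the trajectory without deflection to a given
target.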

Explicit dependence of the lateral Coriolis deflection on latitude $\theta$,
magnetic azimuth $\phi$ and firing angle $\varepsilon$ can be ascertained by reading off the
necessary relations from Figure 6.4 to put (6.27) in the form


$$\Delta R = rt\omega\cos\theta\,(\tan\theta - \tfrac{1}{3}\tan\varepsilon\cos\phi). \qquad (6.28)$$


The condition for vanishing deflection is therefore


$$\tan\varepsilon_0 = \frac{3\tan\theta}{\cos\phi}, \qquad (6.29)$$


and in the Northern hemisphere deflection will be to the left for $\varepsilon > \varepsilon_0$
and to the right for $\varepsilon < \varepsilon_0$.
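
To see these formulas at work, the sketch below computes the rightward deflection in two ways: directly from the vector expression $\Delta\mathbf{r} = -t\boldsymbol{\omega} \times (\mathbf{r} - \tfrac{1}{6}\mathbf{g}t^2)$ projected on $\hat{\mathbf{g}} \times \hat{\mathbf{r}}$, and from the closed form (6.28). The axis convention (x East, y North, z Up), the azimuth measured from North toward East, the use of the zeroth-order flight time and range, and the sample numbers are all assumptions of ours.

```python
import math

OMEGA = 7.29e-5   # rad/s, Eq. (6.11)
G_ACC = 9.8       # m/s^2, taken as constant

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def flight(speed, elevation_deg):
    """Zeroth-order time of flight and range on a horizontal plane."""
    eps = math.radians(elevation_deg)
    t = 2.0 * speed * math.sin(eps) / G_ACC
    rng = speed ** 2 * math.sin(2.0 * eps) / G_ACC
    return t, rng

def lateral_deflection(speed, elevation_deg, azimuth_deg, latitude_deg):
    """Rightward deflection (m) from the vector form of the deviation."""
    t, rng = flight(speed, elevation_deg)
    phi, theta = math.radians(azimuth_deg), math.radians(latitude_deg)
    r_hat = (math.sin(phi), math.cos(phi), 0.0)
    g_hat = (0.0, 0.0, -1.0)
    omega = (0.0, OMEGA * math.cos(theta), OMEGA * math.sin(theta))
    # u = r - (1/6) g t^2, with g pointing along -z
    u = (rng * r_hat[0], rng * r_hat[1], G_ACC * t * t / 6.0)
    delta_r = tuple(-t * c for c in cross(omega, u))
    return dot(cross(g_hat, r_hat), delta_r)

def lateral_deflection_closed(speed, elevation_deg, azimuth_deg, latitude_deg):
    """Rightward deflection (m) from the closed form (6.28)."""
    t, rng = flight(speed, elevation_deg)
    eps, phi, theta = (math.radians(x) for x in (elevation_deg, azimuth_deg, latitude_deg))
    return rng * t * OMEGA * math.cos(theta) * (
        math.tan(theta) - math.tan(eps) * math.cos(phi) / 3.0)

flat = lateral_deflection(300.0, 10.0, 0.0, 45.0)     # flat shot to the North
arched = lateral_deflection(300.0, 80.0, 0.0, 45.0)   # highly arched shot
print(f"flat trajectory:   {flat:+.2f} m (positive means rightward)")
print(f"arched trajectory: {arched:+.2f} m")
```

With these inputs the flat shot is deflected to the right and the highly arched shot to the left, in line with the rule and its exception stated above.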

The Coriolis force plays a significant role in a variety of natural processes,
most prominently the weather. It is, for example, responsible for the circular
motion of cyclones. To see how this comes about, consider the following
equation of motion for a small parcel of air:


$$\varrho\dot{\mathbf{v}} = 2\varrho\,\mathbf{v} \times \boldsymbol{\omega} - \nabla P.$$


Fig. 6.4. Topocentric directional parameters.


Here $\varrho$ is the mass density of the air and P = P(x) is the air pressure, so $\nabla P$
describes the local direction and magnitude of the change in air pressure. A
cyclone is a system of concentric isobars (lines of constant pressure), as shown
in Figure 6.5. As the figure suggests, a parcel of air at rest will be
accelerated in the direction $-\nabla P$ of lower pressure. The Coriolis force
increases with velocity, deflecting the air from the direction $-\nabla P$ until the
condition of equilibrium is reached,


$$2\varrho\,\mathbf{v} \times \boldsymbol{\omega} - \nabla P = 0,$$


where the air is circulating with constant speed along the isobars. This motion
tends to preserve the pressure gradient. The circulation is counterclockwise in
the Northern hemisphere and clockwise in the Southern hemisphere. Naturally,
cyclones arise most frequently in regions of the Earth’s surface where
the Coriolis force is greatest. Of course, our description here is highly
idealized, and as a result of effects we have neglected the air flow in a cyclone
is not precisely along the isobars.


Fig. 6.5. Circulation of air in a cyclone.
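
The equilibrium condition fixes the wind speed once the pressure gradient is known. Taking only the horizontal part of the balance between the Coriolis term and the pressure gradient, the speed along the isobars is $v = |\nabla P| / (2\varrho\,\omega\,|\sin\theta|)$ at latitude $\theta$. The sketch below evaluates this; the SI units, the air density of 1.2 kg/m^3 and the sample gradient of 1 hPa per 100 km are illustrative assumptions of ours, not values from the text.

```python
import math

OMEGA = 7.29e-5  # rad/s, Eq. (6.11)

def equilibrium_wind_speed(pressure_gradient, latitude_deg, density=1.2, omega=OMEGA):
    """Speed (m/s) at which the horizontal Coriolis force balances the
    pressure gradient (Pa/m). Not defined at the Equator, where sin(theta) = 0."""
    sin_lat = abs(math.sin(math.radians(latitude_deg)))
    if sin_lat == 0.0:
        raise ValueError("no horizontal Coriolis balance at the Equator")
    return pressure_gradient / (2.0 * density * omega * sin_lat)

gradient = 100.0 / 100.0e3   # assumed: 1 hPa over 100 km, in Pa/m
speed = equilibrium_wind_speed(gradient, 45.0)
print(f"equilibrium wind speed at 45 degrees latitude: {speed:.1f} m/s")
```

The division by $\sin\theta$ is the quantitative counterpart of the remark that the balance cannot be reached on the Equator.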

Cyclones do not occur on the Earth’s Equator. However, heating at the 
Equator causes air to rise, and the air rushing in to replace it is affected by the 
Coriolis force. Consequently, the “trade winds” come from the North-East
just North of the Equator and from the South-East just South of the Equator. 


Foucault Pendulum 


The Coriolis force produces a small precession, or rotation with time, of a 
pendulum’s plane of oscillation. To exhibit this effect for the first time and 
thereby demonstrate that the Earth is rotating, Leon Foucault constructed a 
heavy pendulum of great length in 1851. Accurate measurements of the 
precession were not made until 1879 by Kamerlingh Onnes in his doctoral 
thesis. 

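The equation of motion for the bob of such a pendulum is


$$\ddot{\mathbf{r}} + 2\boldsymbol{\omega} \times \dot{\mathbf{r}} = \mathbf{g} - \frac{T}{m}\frac{\mathbf{r}}{r}, \qquad (6.30)$$

where $r = |\mathbf{r}|$ is the length of the pendulum and T is the tension in the
suspension, as shown in Figure 6.6.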
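We are interested only in the horizontal component of motion. To separate
horizontal and vertical components in the equation of motion, we write


$$\mathbf{r} = \mathbf{x} + z\hat{\mathbf{g}}, \qquad (6.31)$$


where $\mathbf{x} \cdot \mathbf{g} = 0$ (Figure 6.6). Similarly, we decompose $\boldsymbol{\omega}$ into horizontal
and vertical components,


$$\boldsymbol{\omega} = \boldsymbol{\omega}_\perp + \boldsymbol{\omega}_\parallel, \qquad (6.32)$$

with

$$\boldsymbol{\omega}_\parallel = \hat{\mathbf{g}}\,\hat{\mathbf{g}} \cdot \boldsymbol{\omega} = -\hat{\mathbf{g}}\,\omega\sin\theta, \qquad (6.33)$$


where $\theta$ is the latitude, as indicated in Figure 6.3. The components of the
Coriolis acceleration are given by


$$\boldsymbol{\omega} \times \dot{\mathbf{r}} = (\boldsymbol{\omega}_\perp + \boldsymbol{\omega}_\parallel) \times (\dot{\mathbf{x}} + \dot{z}\hat{\mathbf{g}}) = \boldsymbol{\omega}_\perp \times \dot{\mathbf{x}} + \boldsymbol{\omega}_\parallel \times \dot{\mathbf{x}} + \boldsymbol{\omega} \times \hat{\mathbf{g}}\,\dot{z}.$$


Fig. 6.6. Parameters for a pendulum.


Thus, when (6.31) and (6.32) are substituted into (6.30), the equation of
motion can be separated into the following pair of coupled differential
equations for the vertical and horizontal motions:


$$\ddot{z} + 2\dot{\mathbf{x}} \cdot (\hat{\mathbf{g}} \times \boldsymbol{\omega}) = g - \frac{zT}{mr}, \qquad (6.34)$$


$$\ddot{\mathbf{x}} + 2\boldsymbol{\omega}_\parallel \times \dot{\mathbf{x}} + 2\boldsymbol{\omega} \times \hat{\mathbf{g}}\,\dot{z} = -\frac{\mathbf{x}T}{mr}. \qquad (6.35)$$

For the small amplitude oscillations of a Foucault pendulum, these equations
can be decoupled to a good approximation.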

We can simplify (6.35) before trying to solve it by using information about
the unperturbed periodic motion of a pendulum. The term $2\boldsymbol{\omega} \times \hat{\mathbf{g}}\,\dot{z}$ in (6.35)
is periodic as well as small, for $\boldsymbol{\omega} \times \hat{\mathbf{g}}$ has a fixed direction and $\dot{z}$ is positive on
the upswing and negative on the downswing. Its average value over half of
any period of the pendulum is obviously zero. Therefore we can drop that
term from the equation of motion (6.35), because we are interested not in
details of the pendulum motion during a single swing but in the cumulative
effect of the Coriolis force over many swings. Now consider the “driving force
term” $-\mathbf{x}T/mr$ on the right side of (6.35). This term is already an explicit
function of x, so we get the first order effect of this term on small amplitude
oscillations by regarding the coefficient


$$\omega_0^2 = \frac{T}{mr} \qquad (6.36)$$


as constant. Thus (6.35) can be put in the approximate form

$$\ddot{\mathbf{x}} + 2\boldsymbol{\omega}_\parallel \times \dot{\mathbf{x}} + \omega_0^2\,\mathbf{x} = 0. \qquad (6.37)$$

This is an equation we have encountered and solved before. It is identical to
the equation for a charged harmonic oscillator in a uniform magnetic field.
The solution of (6.37) for the initial conditions $\mathbf{x}(0) = \mathbf{a}$ and $\dot{\mathbf{x}}(0) = 0$ is


x = ae“ cos ua (6.38) 
where

$$T = \frac{2\pi}{(\omega_0^2 + \omega_\parallel^2)^{1/2}} \qquad (6.39)$$

is the period of the pendulum. The solution (6.38) describes an oscillator
precessing with constant angular velocity


$$-\boldsymbol{\omega}_\parallel = \hat{\mathbf{g}}\,\omega\sin\theta. \qquad (6.40)$$

In the Northern hemisphere $\sin\theta > 0$, so the precession is clockwise about
$\mathbf{g}$. Thus, the bob is continually deflected to its right as it swings, with an
angular displacement

$$\omega_\parallel T = \frac{2\pi\omega_\parallel}{(\omega_0^2 + \omega_\parallel^2)^{1/2}}$$

in a single period, and cusps in the orbit at $t = \tfrac{1}{2}T, T, \tfrac{3}{2}T, \ldots$ as shown in
Figure 6.7.


Fig. 6.7. Projection of the path of a pendulum bob on a horizontal plane
showing the (exaggerated) Coriolis precession.
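
To get a feeling for the size of the effect, the following sketch evaluates the precession rate $\omega\sin\theta$ and the angular displacement per swing. It assumes $\omega_0^2 \approx g/r$ for small oscillations (tension close to mg); that approximation, the function names, and the sample latitude and length are our own choices for illustration.

```python
import math

OMEGA = 7.29e-5  # rad/s, Eq. (6.11)

def precession_period_hours(latitude_deg, omega=OMEGA):
    """Hours for the plane of oscillation to turn through a full circle
    at the rate omega * sin(latitude). Diverges at the Equator."""
    rate = omega * abs(math.sin(math.radians(latitude_deg)))
    return 2.0 * math.pi / rate / 3600.0

def angle_per_swing(latitude_deg, length_m, g=9.8, omega=OMEGA):
    """Angular displacement (radians) of the swing plane in one period,
    assuming omega_0^2 = g / length."""
    w_par = omega * abs(math.sin(math.radians(latitude_deg)))
    period = 2.0 * math.pi / math.sqrt(g / length_m + w_par ** 2)
    return w_par * period

print(f"full turn at the pole:      {precession_period_hours(90.0):.2f} h")
print(f"full turn at 45 degrees:    {precession_period_hours(45.0):.2f} h")
print(f"turn per swing (45 deg, 10 m): {angle_per_swing(45.0, 10.0):.2e} rad")
```

The turn per swing is a small fraction of a milliradian, which is why the effect becomes visible only after many swings.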


Rotation and Orbital Motion 


As the Earth rotates about its axis, it also revolves about the Sun. Let us see 
how these motions contribute to the resultant rotational motion of the Earth. 
Let {x”} be a heliocentric inertial system and let {x}, as above, be a geocentric 
system fixed with respect to the Earth. These two reference systems are 
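related by the equation


$$\mathbf{x}'' = R_1^{\dagger}(R_2^{\dagger}\mathbf{x}R_2 + \mathbf{a}_0)R_1 = R^{\dagger}\mathbf{x}R + \mathbf{a}, \qquad (6.41)$$

where

$$\mathbf{a} = R_1^{\dagger}\mathbf{a}_0 R_1 \qquad (6.42)$$

and

$$R = R_2 R_1. \qquad (6.43)$$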


The various quantities involved require some explanation. 


Fig. 6.8 Contribution of orbital motion to the Earth’s rotation. 


The translation vector a designates the location of the Earth’s center. The
spinor $R_1$ determines the rotation of a and so the rotation of the Earth about
the Sun, as expressed by (6.42). The vector $\mathbf{a}_0$ is constant if the Earth’s orbit is
regarded as circular, but its length varies slightly with time for an elliptical
orbit. The rotational velocity $\boldsymbol{\omega}_1$ of the Earth about the Sun in the heliocentric
inertial system $\{\mathbf{x}''\}$ is given by


$$\dot{R}_1 = \tfrac{1}{2}R_1\,i\boldsymbol{\omega}_1. \qquad (6.44)$$


The period of this rotation is, of course,


$$T_1 = \frac{2\pi}{\omega_1} = 365.25\ \text{solar days}. \qquad (6.45)$$


Forces of the various planets on the Earth cause small time variations of $\boldsymbol{\omega}_1$,
but, for present purposes, $\boldsymbol{\omega}_1$ can be regarded as constant.

The spinor $R_2$ describes the rotation of the Earth about its axis with respect
to a frame orbiting the Sun with the Earth. The corresponding rotational
velocity in the Earth system {x} is given by


$$\dot{R}_2 = \tfrac{1}{2}\,i\boldsymbol{\omega}_2 R_2. \qquad (6.46)$$

The corresponding period of rotation is

$$T_2 = \frac{2\pi}{\omega_2} = 1\ \text{solar day}. \qquad (6.47)$$
T, = — = 1 solar day. (6.47) 


This period is directly observed on Earth as the time it takes for the Sun to
repeat its apparent position relative to the Earth during one revolution of the
Earth.

The spinor R describes the rotation of the Earth in any inertial system
regardless of how the origin has been chosen. The corresponding angular
velocity in the Earth frame is given by


$$\dot{R} = \tfrac{1}{2}\,i\boldsymbol{\omega}R. \qquad (6.48)$$


The corresponding period of rotation is


$$T = \frac{2\pi}{\omega} = 1\ \text{sidereal day}. \qquad (6.49)$$


This period is directly observed on Earth as the time it takes for the fixed stars 
to repeat their positions relative to the Earth during one revolution of the 
Earth. 

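The relation among the various rotational velocities is determined by
differentiating (6.43). Thus, using (6.44) and (6.46) as well as the unitarity
property of the spinors, we obtain


$$\dot{R} = \dot{R}_2 R_1 + R_2\dot{R}_1 = \tfrac{1}{2}\,i\boldsymbol{\omega}_2 R_2 R_1 + R_2 R_1\,\tfrac{1}{2}\,i\boldsymbol{\omega}_1 = \tfrac{1}{2}\,i(\boldsymbol{\omega}_2 + R\boldsymbol{\omega}_1 R^{\dagger})R.$$


Hence, by (6.48),

$$\boldsymbol{\omega} = \boldsymbol{\omega}_2 + R\boldsymbol{\omega}_1 R^{\dagger}. \qquad (6.50)$$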


This is the desired relation among the various angular velocities referred to
the Earth’s frame. The spinor R appears explicitly in (6.50), because $\boldsymbol{\omega}_1$ was
defined in an inertial system, so it must be transformed to the corresponding
angular velocity $R\boldsymbol{\omega}_1 R^{\dagger}$ in the Earth frame. Since $\boldsymbol{\omega}$ is essentially constant,
equation (6.48) has the solution


$$R = e^{(1/2)i\boldsymbol{\omega}t}.$$


Therefore, the vector $R\boldsymbol{\omega}_1 R^{\dagger}$ precesses about $\boldsymbol{\omega}$ with a period of one day.
Equation (6.50) implies that $\boldsymbol{\omega}_2$ must also precess about $\boldsymbol{\omega}$, as illustrated in
Figure 6.8. By measuring to the ecliptic (the apparent path of the Sun) it is
found that $\hat{\boldsymbol{\omega}} \cdot \hat{\boldsymbol{\omega}}_1 = \cos(23.5^\circ) = 0.91$, so from (6.50),


$$\frac{\omega - \omega_2}{\omega} \approx \hat{\boldsymbol{\omega}} \cdot \hat{\boldsymbol{\omega}}_1\,\frac{\omega_1}{\omega} = \frac{0.91}{365}. \qquad (6.51)$$


This implies that orbital motion about the Sun is responsible for about
0.91/365 = 0.25% of the Earth’s rotational velocity. It is equivalent to saying
that the solar day is 3.6 minutes longer than the sidereal day.


Larmor’s Theorem


The Coriolis force is important in atomic physics as well as terrestrial
mechanics. For example, the vibrational and rotational motions of polyatomic
molecules are coupled by the Coriolis force. Here we employ it in the proof of
a general result of great utility, Larmor’s Theorem: The effect of a weak
uniform magnetic field B on the motion of a charged particle bound by a central
force is to cause a precession of the unperturbed orbit with rotational velocity
$\boldsymbol{\omega}_L = -q\mathbf{B}/2mc$, where q is the charge and m is the mass of the particle, the
constant c is the speed of light and $\omega_L = |\boldsymbol{\omega}_L|$ is called the Larmor frequency.

The proof of Larmor’s theorem is based on the formal similarity of the 
Coriolis force to the magnetic force. We are concerned with a particle subject 
to the equation of motion 


$$m\ddot{\mathbf{r}}' = f(r')\hat{\mathbf{r}}' + \frac{q}{c}\,\dot{\mathbf{r}}' \times \mathbf{B}'. \qquad (6.52)$$


Here the system {r’} with origin at the center of force, typically the center of 
an atomic nucleus, is regarded as an inertial system. In an attempt to simplify 
the equation of motion by a change of variables, we introduce a rotating 
system {r} defined by the equations 


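$$\mathbf{r}' = \mathcal{R}\mathbf{r} = R^{\dagger}\mathbf{r}R \qquad (6.53)$$

and

$$\dot{R} = \tfrac{1}{2}\,i\boldsymbol{\omega}R = \tfrac{1}{2}R\,i\boldsymbol{\omega}', \qquad (6.54a)$$

or

$$\dot{\mathcal{R}}\mathbf{r} = \mathcal{R}(\boldsymbol{\omega} \times \mathbf{r}) = \boldsymbol{\omega}' \times \mathbf{r}'. \qquad (6.54b)$$


The motion of the particle in the rotating frame is described by r = r(t), while
the operator $\mathcal{R}$ describes the motion of the frame itself and r’ = r’(t)
describes the composite of these two motions. In the rotating frame, the
equation of motion (6.52) becomes


$$\ddot{\mathbf{r}} + 2\boldsymbol{\omega} \times \dot{\mathbf{r}} + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) + \dot{\boldsymbol{\omega}} \times \mathbf{r} = \frac{f}{m}\,\hat{\mathbf{r}} + \frac{q}{mc}\,(\dot{\mathbf{r}} + \boldsymbol{\omega} \times \mathbf{r}) \times \mathbf{B}, \qquad (6.55)$$


where $\mathbf{B} = \mathcal{R}^{\dagger}\mathbf{B}'$.
Evidently, the Coriolis force can be made to cancel the magnetic force in
(6.55) by selecting the rotating frame so that


$$\boldsymbol{\omega} = \boldsymbol{\omega}_L = -\frac{q}{2mc}\,\mathbf{B}. \qquad (6.56)$$

Whereupon (6.55) becomes

$$\ddot{\mathbf{r}} = \frac{f}{m}\,\hat{\mathbf{r}} + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) - \dot{\boldsymbol{\omega}} \times \mathbf{r}. \qquad (6.57)$$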


If the magnetic field is weak and slowly varying in time, the last two terms in
(6.57) are small compared to the binding force and we have, approximately,


$$m\ddot{\mathbf{r}} = f(r)\hat{\mathbf{r}}. \qquad (6.58)$$


Thus, we have succeeded in transforming away the perturbing force from the
equation of motion (6.52). So we have proved Larmor’s theorem.

Larmor’s theorem has the great advantage of decoupling the effect of an
external magnetic field from that of the binding force so they can be studied
separately. According to (6.58), the motion in the rotating frame is the same
as motion in an inertial frame without a perturbing magnetic force, so we call
this the unperturbed motion.




Fig. 6.9(a). Unperturbed elliptical orbit = orbit in rotating frame. (b) Perturbed orbit in 
inertial frame for the case B-v = 0. 


Let us compare the perturbed and unperturbed motions. The condition
that (6.58) be a good approximation of (6.57) can be expressed in the form


$$\left(\frac{qB}{2mc}\right)^2 = \omega_L^2 \ll \omega_0^2 \equiv -\left(\frac{f(r)}{mr}\right)_{\rm ave}. \qquad (6.59)$$

Here the average value of f(r)/mr is evaluated over a period of the unperturbed
motion. The resulting constant $\omega_0$ can be interpreted as the frequency
of the unperturbed motion. Indeed, it will be noted that for circular orbits
Equation (6.58) can be put in the form $\ddot{\mathbf{r}} = -\omega_0^2\,\mathbf{r}$, confirming consistency with
the interpretation we have previously given to $\omega_0$ for the harmonic oscillator.
Thus, the condition (6.59) for the validity of Larmor’s theorem requires that
the frequency of the unperturbed motion be much larger than the Larmor
frequency. From our study of central forces in Section 4-5, we know that the
unperturbed orbit is elliptical or nearly so. So the perturbed orbit can be
visualized as a slowly precessing ellipse, as illustrated in Figure 6.9b.
What are typical values for $\omega_L$ and $\omega_0$ in actuality? The rate of orbital
precession can easily be estimated from known values of the constants q/mc;
thus

$$\omega_L = \left|\frac{qB}{2mc}\right| = (0.9 \times 10^7\ \mathrm{sec^{-1}\,gauss^{-1}})\,B. \qquad (6.60)$$


The largest magnetic fields attained in the laboratory (with superconducting
magnets) are of order $10^5$ gauss, in which case


$$\omega_L \approx 10^{12}\ \mathrm{sec^{-1}}, \qquad (6.61)$$


so the orbit makes a complete precession in about $10^{-11}$ sec. To estimate $\omega_0$,
we need some results from the quantum theory of atoms. According to
quantum theory, the orbital angular momentum of an atomic electron is an
integral multiple of Planck’s constant $\hbar$, so


$$l = |m\mathbf{r} \times \dot{\mathbf{r}}| \approx \hbar = 1.05 \times 10^{-27}\ \text{erg sec}.$$


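For a circular orbit we have $l = mr|\dot{\mathbf{r}}|$ and $|\dot{\mathbf{r}}| = \omega_0 r$, so with an order of
magnitude estimate of the atomic radius, we find


$$\omega_0 = \frac{\hbar}{mr^2} \approx \frac{10^{-27}\ \text{erg sec}}{(10^{-27}\ \mathrm{g})(10^{-8}\ \mathrm{cm})^2} = 10^{16}\ \mathrm{sec^{-1}}. \qquad (6.62)$$

We can conclude, then, that only high precision experiments will reveal
deviations from Larmor’s theorem.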
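
These orders of magnitude are easy to reproduce. The sketch below evaluates the Larmor frequency $|qB/2mc|$ for an electron and the orbital frequency $\hbar/mr^2$ in Gaussian units. The physical constants are standard values; the field strength of $10^5$ gauss and the atomic radius of $10^{-8}$ cm are illustrative inputs of ours, and the function names are hypothetical.

```python
# Standard physical constants in Gaussian (CGS) units.
E_CHARGE = 4.803e-10    # electron charge magnitude, esu
M_ELECTRON = 9.109e-28  # electron mass, g
C_LIGHT = 2.998e10      # speed of light, cm/s
HBAR = 1.055e-27        # reduced Planck constant, erg s

def larmor_frequency(b_gauss, q=E_CHARGE, m=M_ELECTRON):
    """Larmor frequency |q B / (2 m c)| in rad/s for a field in gauss."""
    return abs(q) * b_gauss / (2.0 * m * C_LIGHT)

def orbital_frequency(radius_cm, m=M_ELECTRON):
    """Frequency of a circular orbit carrying angular momentum hbar."""
    return HBAR / (m * radius_cm ** 2)

per_gauss = larmor_frequency(1.0)
omega_l = larmor_frequency(1.0e5)     # assumed field: 1e5 gauss
omega_0 = orbital_frequency(1.0e-8)   # assumed radius: 1e-8 cm
print(f"Larmor frequency per gauss: {per_gauss:.2e} s^-1")
print(f"omega_L / omega_0 = {omega_l / omega_0:.1e}")
```

The ratio of the two frequencies is far below one, which is the condition for the Larmor approximation to hold.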

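The orbits of electrons in atoms cannot be observed. But angular momentum
and energy are constants of motion in a central field, and changes in their
values for atoms can be measured. So let us see how these constants of motion
are affected by a magnetic perturbation. The velocities of perturbed and
unperturbed motions are related by


$$\dot{\mathbf{r}}' = \mathcal{R}(\dot{\mathbf{r}} + \boldsymbol{\omega} \times \mathbf{r}). \qquad (6.63)$$

So the angular momenta $\mathbf{l}' = m\mathbf{r}' \times \dot{\mathbf{r}}'$ and $\mathbf{l} = m\mathbf{r} \times \dot{\mathbf{r}}$ are related by

$$\mathbf{l}' = \mathcal{R}(\mathbf{l} + m\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})) = \mathcal{R}\mathbf{l} + m\mathbf{r}' \times (\boldsymbol{\omega}' \times \mathbf{r}'). \qquad (6.64)$$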


The term $m\mathbf{r}' \times (\boldsymbol{\omega}' \times \mathbf{r}')$ is called the induced angular momentum. It is of
prime importance when $\mathbf{l} = 0$. However, according to our above estimates of
$\omega_0$ and $\omega = \omega_L$, when $\mathbf{l} \neq 0$ the induced angular momentum is usually smaller
in magnitude than $\mathbf{l}$ by a factor of $10^4$ or more, so


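$$\mathbf{l}' \approx \mathcal{R}\mathbf{l} = R^{\dagger}\mathbf{l}R. \qquad (6.65)$$

Differentiating, we have

$$\dot{\mathbf{l}}' = \mathcal{R}(\dot{\mathbf{l}} + \boldsymbol{\omega} \times \mathbf{l}). \qquad (6.66)$$

For central binding forces $\dot{\mathbf{l}} = 0$, so

$$\dot{\mathbf{l}}' = \mathcal{R}(\boldsymbol{\omega} \times \mathbf{l}) = \boldsymbol{\omega}' \times \mathbf{l}'. \qquad (6.67)$$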


Notice that this equation of motion for angular momentum is completely
independent of any details as to how electrons are bound to atoms.

We have become very familiar with the solution of an equation with the
form (6.67) for constant $\boldsymbol{\omega}'$. Thus, from the spinor equation (6.54a) we get


$$R = e^{(1/2)i\boldsymbol{\omega}t}$$


with


$$\boldsymbol{\omega} = -\frac{q\mathbf{B}}{2mc}$$


and the initial condition $\mathbf{l}' = \mathbf{l}$ at t = 0. In this case, (6.65) becomes the
explicit equation


$$\mathbf{l}' = e^{-(1/2)i\boldsymbol{\omega}t}\,\mathbf{l}\,e^{(1/2)i\boldsymbol{\omega}t}. \qquad (6.68)$$


This describes a uniform precession of $\mathbf{l}'$ about $\boldsymbol{\omega}$, as shown in Figure 6.10. In
Chapter 7, we will investigate solutions of (6.67) for time varying magnetic
fields leading to the phenomenon of magnetic resonance, a phenomenon of great
importance for investigating the atomic and molecular structure of matter.

Let us turn now to energy considerations. We know that a conservative central
force is derivable from a potential, so writing


$$f(r)\hat{\mathbf{r}} = -\nabla V(r),$$

we get the familiar expression


$$E = \tfrac{1}{2}m\dot{\mathbf{r}}^2 + V(r)$$


for the energy of the unperturbed system. We have observed before that the
magnetic field does not contribute to the potential energy, so from (6.52)
we get


$$E' = \tfrac{1}{2}m\dot{\mathbf{r}}'^2 + V(r')$$


for the energy of the perturbed motion. A magnetic field affects the energy
only by altering the kinetic energy, so to make the influence of the magnetic
energy explicit we must compare the energies of perturbed and unperturbed
motions. From (6.63) we have


$$\dot{\mathbf{r}}'^2 = (\dot{\mathbf{r}} + \boldsymbol{\omega} \times \mathbf{r})^2 = \dot{\mathbf{r}}^2 + 2\boldsymbol{\omega} \cdot (\mathbf{r} \times \dot{\mathbf{r}}) + (\boldsymbol{\omega} \times \mathbf{r})^2.$$


Hence,

$$E' = E + \boldsymbol{\omega} \cdot \mathbf{l} + \tfrac{1}{2}m(\boldsymbol{\omega} \times \mathbf{r})^2. \qquad (6.69)$$


Fig. 6.10. Precession of angular momentum in a constant magnetic field.


The last term in (6.69) is neglected in the Larmor approximation, so we can
conclude that a magnetic field induces a shift in the energy of a bound charged
particle by the amount


$$\boldsymbol{\omega} \cdot \mathbf{l} = -\frac{q}{2mc}\,\mathbf{B} \cdot \mathbf{l}. \qquad (6.70)$$

This energy shift in atoms is easily observed by modern methods, and it is
known as the Zeeman effect. From (6.70) it follows that the energy $\boldsymbol{\omega} \cdot \mathbf{l}$ is
constant for a constant magnetic field. For large magnetic fields the shift in
energy due to the last term in (6.69) can also be observed. This is known as
the Quadratic Zeeman (or Paschen-Back) effect, since it varies quadratically
with the magnetic field strength. Like its relative, magnetic resonance
(discussed in Section 7-3), the Zeeman effect is of great value for probing the
structure of matter.
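
As an illustration of the size of the shift (6.70), the sketch below evaluates $-(q/2mc)\,\mathbf{B} \cdot \mathbf{l}$ for an electron whose angular momentum is a multiple of $\hbar$ directed along the field. The alignment of $\mathbf{l}$ with $\mathbf{B}$, the sample field of $10^4$ gauss and the function name are assumptions of ours; the constants are standard Gaussian-unit values.

```python
# Standard physical constants in Gaussian (CGS) units.
E_CHARGE = 4.803e-10    # electron charge magnitude, esu
M_ELECTRON = 9.109e-28  # electron mass, g
C_LIGHT = 2.998e10      # speed of light, cm/s
HBAR = 1.055e-27        # reduced Planck constant, erg s
ERG_TO_EV = 6.242e11    # electron volts per erg

def zeeman_shift_erg(b_gauss, l_over_hbar=1.0, q=-E_CHARGE, m=M_ELECTRON):
    """Energy shift -(q / 2mc) B.l in erg, for l = l_over_hbar * hbar
    taken parallel to B. The default charge is that of an electron."""
    return -(q / (2.0 * m * C_LIGHT)) * b_gauss * l_over_hbar * HBAR

shift_per_gauss = zeeman_shift_erg(1.0)
shift_ev = zeeman_shift_erg(1.0e4) * ERG_TO_EV   # assumed field: 1e4 gauss
print(f"shift per gauss: {shift_per_gauss:.3e} erg")
print(f"shift at 1e4 gauss: {shift_ev:.2e} eV")
```

Reversing the direction of $\mathbf{l}$ (a negative `l_over_hbar`) reverses the sign of the shift, which is how a single level splits into several in a magnetic field.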


5-6. Exercises 


(6.1) On the surface of the Earth, the true vertical is directed along a line
through the center of the Earth, and the apparent vertical is directed
along a plumb line. Determine how the angle $\alpha$ between true
and apparent verticals varies with latitude $\theta$ (Figure 6.1). Estimate
the maximum value for $\alpha$ and the latitude at which it occurs.

(6.2) At about sea level and a latitude of 45° a 16 pound (7.27 kg) steel 
ball is dropped from a height of 45 m. 

(a) Calculate the displacement of its point of impact to first order in 
the angular velocity of the earth. 

(b) It is argued that while the ball is falling, the Earth will rotate 
under it to the East, so the displacement will be to the West. 
Show what is wrong with this argument by describing what 
happens in an inertial frame. 

(c) Estimate the effect of the Coriolis force to order $\omega^2$.

(6.3) A free particle is constrained to move in a horizontal plane of a
topocentric frame, but it is confined to a region with circular walls from
which it rebounds elastically (Figure 6.11). Show that the direction of the
particle motion precesses at exactly twice the rate of a Foucault pendulum.
How can this difference in precession rates be accounted for?

Fig. 6.11. A free particle, reflected by circular walls, precesses at twice
the Foucault rate.

(6.4) At a point on the Earth’s Equator, determine the relative magnitude
of centrifugal forces due to orbital motion about the Sun and rotation
of the Earth. (The mean radius of the Earth’s orbit is $1.495 \times 10^8$ km;
the mean radius of the Earth is 6371 km).

(6.5) List all the forces and effects you can think of (at least 10) which are
neglected when a projectile launched from the surface of the Earth
is described as moving with constant acceleration. Estimate their
magnitudes and describe the conditions under which they will be
significant.


Chapter 6 


Many-Particle Systems 


This chapter develops general concepts, theorems and techniques for model- 
ing complex, mechanical systems. The three main theorems on system en- 
ergy, momentum, and angular momentum are proved in Section 6-1. These 
theorems provide the starting point for rigid body mechanics in Chapter 7, as 
well as for other kinds of mechanical system discussed in this chapter. The 
method of Lagrange formulated in Section 6-2 provides a systematic means 
for expressing the equations of motion for any mechanical system in terms of 
any convenient set of variables. The method proves to be of great value in the 
theory of small oscillations as well as applications to molecular vibrations
treated in Section 6-4. The general theory of small oscillations in Section 6-4
raises the level of sophistication required of the student, so some examples of 
small oscillations are treated in Section 6-3 by more elementary means. 

The final section of this chapter discusses the most venerable unsolved 
problem in celestial mechanics, the Newtonian 3-body problem. It comple- 
ments the development of celestial mechanics in Chapter 8. 


6-1. General Properties of Many-Particle Systems 


Classical mechanics provides us with general principles for modeling any 
material body, be it a solid, liquid or gas, as a system of interacting particles. 
To analyze the behavior of a system, we must separate it from its environ- 
ment. This is done by distinguishing external and internal variables. The 
external variables describe the system as a whole and its interaction with other
(external) systems. The internal variables describe the (internal) structure of 
the system and the interactions among its parts. The analysis of internal and 
external variables for a two-particle system was carried out in Section 4-6. 
Here we analyze the general case of an N-particle system. The results include 
three major theorems: (1) the center-of-mass theorem, (2) the angular mo- 
mentum theorem, and (3) the work-energy theorem. These theorems pro- 
vide a starting point for the modeling of any complex mechanical system. 




The superposition principle allows us to separate external and internal
forces; so the equation of motion for the $i$th particle in an $N$-particle
system has the form

$$m_i \ddot{\mathbf{x}}_i = \mathbf{F}_i + \sum_{j=1}^{N} \mathbf{f}_{ij}, \tag{1.1}$$

where the external force $\mathbf{F}_i$ is the resultant force exerted by objects external to
the system, and the interparticle force $\mathbf{f}_{ij}$ is the force exerted on the $i$th particle
by the $j$th particle. Also, we assume that $\mathbf{f}_{ii} = 0$, that is, that a particle does
not exert a force on itself.

According to the weak form of Newton's 3rd law, mutual forces of any two 
particles on one another are equal and opposite so the interparticle forces are 
related by 


$$\mathbf{f}_{ij} = -\mathbf{f}_{ji}. \tag{1.2}$$


The strong form of Newton’s 3rd law holds also that all two-particle forces are 
central forces, that is, directed along a straight line connecting the particles. 
The condition that interparticle forces be central is expressed by 


$$(\mathbf{x}_i - \mathbf{x}_j) \wedge \mathbf{f}_{ij} = 0. \tag{1.3}$$


We adopt the strong form of Newton's 3rd law, because it holds for a large 
class of systems, and it greatly simplifies the analysis. Deviations from the 3rd 
law arise principally in systems composed of particles which are not accurately 
described as structureless point particles but have some internal structure 
which significantly affects their interactions with other particles. However, a 
deeper analysis may show that the structure of such a particle can be 
described by assuming that the particle is itself composed of structureless 
point particles. The general results derived below provide the foundation for 
such analysis. 


Translational Motion 


Now, to develop an equation describing the translational motion of the 
system as a whole, we add the equations of motion (1.1) for each particle; 
thus, 


$$\sum_i m_i \ddot{\mathbf{x}}_i = \sum_i \mathbf{F}_i + \sum_i \sum_j \mathbf{f}_{ij}. \tag{1.4}$$


The weak form of Newton’s third law (1.2) implies that the internal forces in 
the sum cancel; formally, 


$$\sum_i \sum_j \mathbf{f}_{ij} = \sum_j \sum_i \mathbf{f}_{ji} = -\sum_i \sum_j \mathbf{f}_{ij} = 0,$$

so (1.4) reduces to

$$\frac{d}{dt}\Big(\sum_i m_i \dot{\mathbf{x}}_i\Big) = \sum_i \mathbf{F}_i. \tag{1.5}$$

To put this in the standard form for a particle equation of motion, we define
the following set of external variables for the system:


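Total mass,

$$M = \sum_i m_i, \tag{1.6}$$

Center of mass (CM),

$$\mathbf{X} = \frac{\sum_i m_i \mathbf{x}_i}{\sum_i m_i} = \frac{1}{M}\sum_i m_i \mathbf{x}_i, \tag{1.7}$$

Total momentum,

$$\mathbf{P} = M\dot{\mathbf{X}} = \sum_i m_i \dot{\mathbf{x}}_i, \tag{1.8}$$

Total external force,

$$\mathbf{F} = \sum_i \mathbf{F}_i. \tag{1.9}$$

In terms of these variables, Equation (1.5) has the form

$$\dot{\mathbf{P}} = M\ddot{\mathbf{X}} = \mathbf{F}. \tag{1.10}$$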


This equation, along with its interpretation, is the Center of Mass (CM) 
Theorem. 

The system is said to be isolated if $\mathbf{F}_i = 0$, that is, if the external force on
each particle vanishes. For an isolated system, then, $\dot{\mathbf{P}} = 0$, so the momentum
of the system is a constant of the motion; in other words, the momentum of an
isolated system is conserved.

According to (1.7), the CM X is a kind of average position of the particles 
in a system. The CM theorem (1.10) tells us that the motion of the CM is 
determined by the total external force alone, irrespective of the internal 
forces. This independence of internal force is a consequence of the weak form 
of the 3rd law (1.2), so empirical verification of the CM theorem supports the 
3rd law. 

The CM theorem describes the average motion of a system as equivalent to
that of a single particle of mass $M$ located at the CM $\mathbf{X}$. Thus, it separates
external and internal motions, allowing us to study them independently. And 
when we are not interested in internal structure, the CM theorem justifies 
treating the entire system as a single particle. 
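
The CM theorem lends itself to a quick numerical check. The following is a minimal sketch, assuming two particles joined by a linear spring (an internal force satisfying (1.2)) in a uniform gravitational field. The spring model, the parameter values, the helper names and the explicit Euler stepping are illustrative assumptions, not part of the text. The center of mass of the pair is compared with a single particle of mass $M$ acted on by the total external force alone.

```python
# Sketch of the CM theorem (1.10): two particles joined by a spring in uniform
# gravity. The spring model, masses and step size are illustrative assumptions.

def add(a, b):
    return [p + q for p, q in zip(a, b)]

def scale(s, a):
    return [s * p for p in a]

def weighted_mean(masses, vectors):
    """(1/M) * sum_i m_i v_i, used for both the CM position and the CM velocity."""
    M = sum(masses)
    out = [0.0] * len(vectors[0])
    for m, v in zip(masses, vectors):
        out = add(out, scale(m / M, v))
    return out

def simulate(steps=1000, dt=1e-3, k=50.0, g=(0.0, -9.8)):
    masses = [1.0, 3.0]
    xs = [[0.0, 0.0], [1.0, 0.5]]
    vs = [[2.0, 1.0], [-1.0, 4.0]]
    # Equivalent single particle of mass M placed at the CM.
    X = weighted_mean(masses, xs)
    V = weighted_mean(masses, vs)
    for _ in range(steps):
        # internal spring force on particle 1 by particle 2; f21 = -f12
        f12 = scale(-k, add(xs[0], scale(-1.0, xs[1])))
        forces = [add(scale(masses[0], g), f12),
                  add(scale(masses[1], g), scale(-1.0, f12))]
        # explicit Euler step for every particle ...
        new_xs = [add(x, scale(dt, v)) for x, v in zip(xs, vs)]
        vs = [add(v, scale(dt / m, F)) for v, F, m in zip(vs, forces, masses)]
        xs = new_xs
        # ... and the same step for the single particle under F = M g alone
        X = add(X, scale(dt, V))
        V = add(V, scale(dt, g))
    return weighted_mean(masses, xs), X

cm_of_system, single_particle = simulate()
print(cm_of_system, single_particle)
```

The two results should agree to rounding error however stiff the spring is made, because the internal forces cancel in the sum at every step.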

Internal and external properties of the system can be separated by intro- 
ducing the internal variables 


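$$\mathbf{r}_i = \mathbf{x}_i - \mathbf{X} \tag{1.11}$$

describing the position of each particle relative to the CM. The internal
velocities are therefore

$$\dot{\mathbf{r}}_i = \dot{\mathbf{x}}_i - \dot{\mathbf{X}}.$$

This leads us to the following decomposition of the total kinetic energy $K$ for
the system:

$$K = \sum_i \tfrac{1}{2} m_i \dot{\mathbf{x}}_i^2 = \sum_i \tfrac{1}{2} m_i (\dot{\mathbf{X}} + \dot{\mathbf{r}}_i)^2$$

$$= \tfrac{1}{2}\Big(\sum_i m_i\Big)\dot{\mathbf{X}}^2 + \dot{\mathbf{X}} \cdot \Big(\sum_i m_i \dot{\mathbf{r}}_i\Big) + \sum_i \tfrac{1}{2} m_i \dot{\mathbf{r}}_i^2.$$

But

$$\sum_i m_i \mathbf{r}_i = \sum_i m_i (\mathbf{x}_i - \mathbf{X}) = \sum_i m_i \mathbf{x}_i - M\mathbf{X} = 0,$$

so $\sum_i m_i \dot{\mathbf{r}}_i = 0$ as well. Hence,

$$K = \tfrac{1}{2} M \dot{\mathbf{X}}^2 + \sum_i \tfrac{1}{2} m_i \dot{\mathbf{r}}_i^2 = K_{\mathrm{CM}} + K_{\mathrm{int}}, \tag{1.12}$$

where

$$K_{\mathrm{int}} = \sum_i \tfrac{1}{2} m_i \dot{\mathbf{r}}_i^2 \tag{1.13a}$$

is the internal kinetic energy, and

$$K_{\mathrm{CM}} = \tfrac{1}{2} M \dot{\mathbf{X}}^2 \tag{1.13b}$$

is the CM kinetic energy, also known as the translational kinetic energy of the
system. Thus, the total kinetic energy is the sum of internal and external (CM)
kinetic energies.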
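
As a small illustration of the decomposition (1.12), the sketch below computes $K$, $K_{\mathrm{CM}}$ and $K_{\mathrm{int}}$ directly from a list of particle velocities. The function name and the sample data are hypothetical choices made only for this example.

```python
# Sketch: split the kinetic energy into CM and internal parts (Eqs. 1.12-1.13).
# Masses and velocities below are arbitrary illustrative values.

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def kinetic_energy_split(masses, velocities):
    """Return (K, K_cm, K_int) for the given particle velocities."""
    M = sum(masses)
    dim = len(velocities[0])
    # CM velocity: X_dot = (1/M) sum_i m_i x_dot_i
    V_cm = [sum(m * v[k] for m, v in zip(masses, velocities)) / M
            for k in range(dim)]
    K = sum(0.5 * m * dot(v, v) for m, v in zip(masses, velocities))
    K_cm = 0.5 * M * dot(V_cm, V_cm)
    # internal velocities r_dot_i = x_dot_i - X_dot
    K_int = 0.0
    for m, v in zip(masses, velocities):
        r_dot = [a - b for a, b in zip(v, V_cm)]
        K_int += 0.5 * m * dot(r_dot, r_dot)
    return K, K_cm, K_int

masses = [1.0, 2.0, 3.0]
velocities = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, -1.0]]
K, K_cm, K_int = kinetic_energy_split(masses, velocities)
print(K, K_cm + K_int)
```

The cross term in the expansion never has to be computed explicitly, since it vanishes identically once the velocities are referred to the CM.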

Similarly, the total angular momentum J for the system submits to the 
decomposition 


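$$\mathbf{J} = \sum_i m_i \mathbf{x}_i \times \dot{\mathbf{x}}_i = \sum_i m_i (\mathbf{X} + \mathbf{r}_i) \times (\dot{\mathbf{X}} + \dot{\mathbf{r}}_i)$$

$$= \Big(\sum_i m_i\Big)\mathbf{X} \times \dot{\mathbf{X}} + \Big(\sum_i m_i \mathbf{r}_i\Big) \times \dot{\mathbf{X}} + \mathbf{X} \times \Big(\sum_i m_i \dot{\mathbf{r}}_i\Big) + \sum_i m_i \mathbf{r}_i \times \dot{\mathbf{r}}_i.$$

Hence,

$$\mathbf{J} = \sum_i m_i \mathbf{x}_i \times \dot{\mathbf{x}}_i = M\mathbf{X} \times \dot{\mathbf{X}} + \mathbf{l}, \tag{1.14}$$

where

$$\mathbf{l} = \sum_i m_i \mathbf{r}_i \times \dot{\mathbf{r}}_i \tag{1.15}$$

is the internal angular momentum, and $\mathbf{X} \times \mathbf{P} = M\mathbf{X} \times \dot{\mathbf{X}}$ is known as the
orbital angular momentum of the system. Thus the total angular momentum is
simply the sum of internal and external (orbital) angular momenta. To
conform to standard usage, we have defined the angular momentum with
cross products instead of outer products, so it is a vector instead of a bivector.
However, as explained in Section 4-1, the bivector form is more fundamental
and we shall switch to it later when it is advantageous.


Rotational Motion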


To describe rotational motion of the system as a whole, we derive an equation 
of motion for the total angular momentum. Differentiating (1.14) and using 
(1.1) and (1.2), we have 


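$$\dot{\mathbf{J}} = \frac{d}{dt}\big(M\mathbf{X} \times \dot{\mathbf{X}} + \mathbf{l}\big) = \sum_i m_i \mathbf{x}_i \times \ddot{\mathbf{x}}_i$$

$$= \sum_i \mathbf{x}_i \times \Big(\mathbf{F}_i + \sum_j \mathbf{f}_{ij}\Big)$$

$$= \sum_i \mathbf{x}_i \times \mathbf{F}_i + \tfrac{1}{2}\sum_i \sum_j (\mathbf{x}_i - \mathbf{x}_j) \times \mathbf{f}_{ij}.$$

For central forces the last term vanishes, and we have the rotational equation
of motion

$$\dot{\mathbf{J}} = \sum_i \mathbf{x}_i \times \mathbf{F}_i = \boldsymbol{\Gamma}_0, \tag{1.16}$$

where $\boldsymbol{\Gamma}_0$ is known as the torque (about the origin). It is readily verified that a
displacement of the origin changes the value of $\mathbf{J}$ and $\boldsymbol{\Gamma}_0$, without altering the
form of (1.16).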

The most notable property of the rotational equation of motion (1.16) is
the fact that internal forces do not contribute directly to the torque. Our
derivation shows that this is a consequence of Newton's 3rd law in its strong
form. Note, however, that the torque in (1.16) depends on the values of the $\mathbf{x}_i$,
whose time variations depend on the internal forces. Therefore, the torque is
indirectly dependent on the internal forces. Of course, for an isolated system
the torque vanishes for arbitrary internal forces. Hence the angular momentum
of an isolated system is conserved.
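
The two conservation laws for an isolated system can be illustrated numerically. The sketch below is only an illustration under stated assumptions: three particles interacting through an inverse-square attraction (one possible central force obeying the strong form of the 3rd law), arbitrary initial data, and a semi-implicit Euler integrator. With this integrator the total momentum and total angular momentum are preserved to rounding error, because the pair forces cancel in the sums exactly as in the derivation above.

```python
# Sketch: for an isolated system with central pair forces, total momentum and
# total angular momentum are conserved. The inverse-square attraction,
# masses, and initial data are illustrative assumptions.
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def total_momentum(ms, vs):
    return [sum(m * v[k] for m, v in zip(ms, vs)) for k in range(3)]

def total_angular_momentum(ms, xs, vs):
    J = [0.0, 0.0, 0.0]
    for m, x, v in zip(ms, xs, vs):
        c = cross(x, v)
        J = [J[k] + m * c[k] for k in range(3)]
    return J

def pair_forces(ms, xs):
    """Central forces f_ij = -m_i m_j (x_i - x_j)/|x_i - x_j|^3, so f_ij = -f_ji."""
    n = len(ms)
    F = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [xs[i][k] - xs[j][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            for k in range(3):
                f = -ms[i] * ms[j] * d[k] / r**3
                F[i][k] += f
                F[j][k] -= f
    return F

def step(ms, xs, vs, dt):
    # Semi-implicit Euler: update velocities first, then positions.
    F = pair_forces(ms, xs)
    vs = [[v[k] + dt * F[i][k] / ms[i] for k in range(3)] for i, v in enumerate(vs)]
    xs = [[x[k] + dt * vs[i][k] for k in range(3)] for i, x in enumerate(xs)]
    return xs, vs

ms = [1.0, 2.0, 1.5]
xs = [[1.0, 0.0, 0.0], [-1.0, 0.5, 0.0], [0.0, -1.0, 1.0]]
vs = [[0.0, 0.5, 0.1], [0.2, -0.3, 0.0], [-0.1, 0.0, 0.2]]
P0, J0 = total_momentum(ms, vs), total_angular_momentum(ms, xs, vs)
for _ in range(500):
    xs, vs = step(ms, xs, vs, 1e-3)
P1, J1 = total_momentum(ms, vs), total_angular_momentum(ms, xs, vs)
print(P0, P1)
print(J0, J1)
```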

It is usually desirable to separate the total angular momentum into its 
external and internal parts, for the parts satisfy independent equations of 
motion. Time variation of the external (orbital) angular momentum is deter- 
mined by the CM equation (1.10); thus, 


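$$\frac{d}{dt}\big(M\mathbf{X} \times \dot{\mathbf{X}}\big) = M\mathbf{X} \times \ddot{\mathbf{X}} = \mathbf{X} \times \mathbf{F}. \tag{1.17}$$

On the other hand,

$$\dot{\mathbf{J}} = \frac{d}{dt}\big(M\mathbf{X} \times \dot{\mathbf{X}} + \mathbf{l}\big) = \mathbf{X} \times \mathbf{F} + \dot{\mathbf{l}}.$$

Substituting this into (1.16), we get the equation of motion for the internal
angular momentum:

$$\dot{\mathbf{l}} = \sum_i (\mathbf{x}_i - \mathbf{X}) \times \mathbf{F}_i = \sum_i \mathbf{r}_i \times \mathbf{F}_i = \boldsymbol{\Gamma}. \tag{1.18}$$

In contrast to (1.16), this equation is independent of the origin, or rather, it
differs from (1.16) by a shift of the arbitrary fixed origin to an origin intrinsic
to the system, the center of mass.

The angular momentum equation of motion (1.18) is the second major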
result of general many-particle systems theory. Let us call it the angular 
momentum theorem. To make use of this theorem, however, we need further 
results relating the angular momentum to kinematical variables of the system. 

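Often we wish to separate the rotational motion of the system as a whole
from the relative motions of its parts. This can be accomplished by introducing
a body frame rotating with the system. The position $\mathbf{r}_i'$ of a particle in the
body frame is related to the internal position variable $\mathbf{r}_i = \mathbf{x}_i - \mathbf{X}$ by a rotation

$$\mathbf{r}_i = R^\dagger \mathbf{r}_i' R. \tag{1.19}$$

As we saw in Section 5-5, the time dependence of the rotation is determined
by the differential equation

$$\dot{R} = \tfrac{1}{2} R i \boldsymbol{\omega}, \tag{1.20}$$

so that

$$\dot{\mathbf{r}}_i = \boldsymbol{\omega} \times \mathbf{r}_i + R^\dagger \dot{\mathbf{r}}_i' R. \tag{1.21}$$

The vector $\boldsymbol{\omega}$ is called the rotational velocity of the system or body, if you will.
Substituting (1.21) into the expression (1.15) for the internal angular
momentum, we obtain

$$\mathbf{l} = \sum_i m_i \mathbf{r}_i \times (\boldsymbol{\omega} \times \mathbf{r}_i) + R^\dagger\Big(\sum_i m_i \mathbf{r}_i' \times \dot{\mathbf{r}}_i'\Big)R. \tag{1.22}$$

The first set of terms on the right side of (1.22) defines a linear function,

$$\mathcal{I}\boldsymbol{\omega} = \sum_i m_i \mathbf{r}_i \times (\boldsymbol{\omega} \times \mathbf{r}_i), \tag{1.23}$$

so (1.22) can be put in the form

$$\mathbf{l} = \mathcal{I}\boldsymbol{\omega} + R^\dagger\Big(\sum_i m_i \mathbf{r}_i' \times \dot{\mathbf{r}}_i'\Big)R. \tag{1.24}$$

The linear operator $\mathcal{I}$ is called the inertia tensor of the body. Using the
identities

$$\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r}) = -\mathbf{r} \cdot (\boldsymbol{\omega} \wedge \mathbf{r}) = \mathbf{r}\,\mathbf{r} \wedge \boldsymbol{\omega} = r^2 \boldsymbol{\omega} - \mathbf{r}\,\mathbf{r} \cdot \boldsymbol{\omega},$$

we can write the inertia tensor in the alternative forms

$$\mathcal{I}\boldsymbol{\omega} = \sum_i m_i \mathbf{r}_i\,\mathbf{r}_i \wedge \boldsymbol{\omega} = \sum_i m_i \big(r_i^2 \boldsymbol{\omega} - \mathbf{r}_i\,\mathbf{r}_i \cdot \boldsymbol{\omega}\big). \tag{1.25}$$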

For most purposes, (1.25) is more convenient than (1.23). 
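
The equivalence of the two forms can be spot-checked numerically. The sketch below treats the inertia tensor as a linear function acting on a vector, once with the double cross product of (1.23) and once with the expanded vector form in (1.25). The function names and the particle data are hypothetical, chosen only for the example.

```python
# Sketch: the inertia tensor as a linear function of omega, evaluated two ways:
# Eq. (1.23) with cross products and the expanded form of Eq. (1.25).
# The particle data are arbitrary illustrative values.

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def inertia_cross(ms, rs, w):
    """I(w) = sum_i m_i r_i x (w x r_i), Eq. (1.23)."""
    out = [0.0, 0.0, 0.0]
    for m, r in zip(ms, rs):
        c = cross(r, cross(w, r))
        out = [out[k] + m * c[k] for k in range(3)]
    return out

def inertia_expanded(ms, rs, w):
    """I(w) = sum_i m_i (r_i^2 w - r_i (r_i . w)), Eq. (1.25)."""
    out = [0.0, 0.0, 0.0]
    for m, r in zip(ms, rs):
        r2, rw = dot(r, r), dot(r, w)
        out = [out[k] + m * (r2 * w[k] - r[k] * rw) for k in range(3)]
    return out

ms = [1.0, 2.0, 0.5]
rs = [[1.0, 0.0, 0.5], [-0.5, 1.0, 0.0], [0.3, -0.7, 2.0]]
w = [0.2, -1.0, 3.0]
Iw_cross = inertia_cross(ms, rs, w)
Iw_expanded = inertia_expanded(ms, rs, w)
print(Iw_cross, Iw_expanded)
```

The expanded form needs only dot products, which is one practical sense in which (1.25) is the more convenient of the two.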

We have not yet explained how the body frame is to be determined. For a
rigid body, it is determined by the condition that all particles of the body be at
rest in the body frame. Thus $\dot{\mathbf{r}}_i' = 0$, so the distances between particles

$$r_{ij} = |\mathbf{x}_i - \mathbf{x}_j| = |\mathbf{r}_i - \mathbf{r}_j| = |\mathbf{r}_i' - \mathbf{r}_j'|$$

are constants of the motion. Consequently, from (1.24) it follows that, for a
rigid body,

$$\mathbf{l} = \mathcal{I}\boldsymbol{\omega}. \tag{1.26}$$

This reduces the complex “dynamical variable” $\mathbf{l}$ to the simple “kinematical
variable” $\boldsymbol{\omega}$, because the inertia tensor $\mathcal{I}$ depends only on the fixed internal
structure of the body. Since $\mathcal{I}$ is a linear operator, when (1.26) is substituted
into the equation of motion (1.18), one obtains


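$$\dot{\mathbf{l}} = \mathcal{I}\dot{\boldsymbol{\omega}} + \dot{\mathcal{I}}\boldsymbol{\omega} = \boldsymbol{\Gamma}. \tag{1.27}$$

The time derivative of the inertia tensor can be computed from the definition
(1.25) using $\dot{\mathbf{r}}_i = \boldsymbol{\omega} \times \mathbf{r}_i$. Thus, for an arbitrary vector argument $\mathbf{a}$ we have

$$\dot{\mathcal{I}}\mathbf{a} = \sum_i m_i\big(-\dot{\mathbf{r}}_i\,\mathbf{r}_i \cdot \mathbf{a} - \mathbf{r}_i\,\dot{\mathbf{r}}_i \cdot \mathbf{a}\big)$$

$$= -\sum_i m_i\big[(\boldsymbol{\omega} \times \mathbf{r}_i)\,\mathbf{r}_i \cdot \mathbf{a} + \mathbf{r}_i\,(\boldsymbol{\omega} \times \mathbf{r}_i) \cdot \mathbf{a}\big]$$

$$= \sum_i m_i\big[\boldsymbol{\omega} \times (r_i^2 \mathbf{a} - \mathbf{r}_i\,\mathbf{r}_i \cdot \mathbf{a}) - r_i^2(\boldsymbol{\omega} \times \mathbf{a}) + \mathbf{r}_i\,\mathbf{r}_i \cdot (\boldsymbol{\omega} \times \mathbf{a})\big]$$

$$= \sum_i m_i\big[\boldsymbol{\omega} \times (\mathbf{r}_i\,\mathbf{r}_i \wedge \mathbf{a}) - \mathbf{r}_i\,\mathbf{r}_i \wedge (\boldsymbol{\omega} \times \mathbf{a})\big].$$

Whence, the general result

$$\dot{\mathcal{I}}\mathbf{a} = \boldsymbol{\omega} \times (\mathcal{I}\mathbf{a}) - \mathcal{I}(\boldsymbol{\omega} \times \mathbf{a}). \tag{1.28}$$

Using this in (1.27), we obtain

$$\dot{\mathbf{l}} = \mathcal{I}\dot{\boldsymbol{\omega}} + \boldsymbol{\omega} \times (\mathcal{I}\boldsymbol{\omega}) = \boldsymbol{\Gamma}. \tag{1.29}$$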


All the internal motion of a rigid body is rotational, and it is completely 
determined by the equation of motion (1.29), with appropriate initial con- 
ditions, of course. This will be the starting point for the study of rigid body 
motion in Chapter 7. 
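
The identity (1.28) for the time derivative of the inertia tensor can be verified by a finite-difference experiment. The sketch below is a hedged illustration with made-up particle data: it displaces each particle along its rigid-body velocity $\boldsymbol{\omega} \times \mathbf{r}_i$ by a small step forward and backward, and compares the central difference of $\mathcal{I}\mathbf{a}$ with the right side of (1.28). Because $\mathcal{I}$ is quadratic in the $\mathbf{r}_i$, this central difference reproduces the derivative up to rounding error.

```python
# Sketch: finite-difference check of Eq. (1.28),
#   d/dt (I a) = w x (I a) - I (w x a),
# for a rigid set of particles whose positions obey r_dot = w x r.
# Particle data, w, and a are arbitrary illustrative values.

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def inertia(ms, rs, a):
    """I(a) = sum_i m_i (r_i^2 a - r_i (r_i . a)), Eq. (1.25)."""
    out = [0.0, 0.0, 0.0]
    for m, r in zip(ms, rs):
        r2, ra = dot(r, r), dot(r, a)
        out = [out[k] + m * (r2 * a[k] - r[k] * ra) for k in range(3)]
    return out

def inertia_rate_fd(ms, rs, w, a, h=1e-3):
    """Central difference of I(a) along the rigid motion r_dot = w x r."""
    plus, minus = [], []
    for r in rs:
        v = cross(w, r)
        plus.append([r[k] + h * v[k] for k in range(3)])
        minus.append([r[k] - h * v[k] for k in range(3)])
    Ip, Im = inertia(ms, plus, a), inertia(ms, minus, a)
    return [(Ip[k] - Im[k]) / (2.0 * h) for k in range(3)]

def inertia_rate_formula(ms, rs, w, a):
    """Right side of Eq. (1.28): w x I(a) - I(w x a)."""
    t1 = cross(w, inertia(ms, rs, a))
    t2 = inertia(ms, rs, cross(w, a))
    return [t1[k] - t2[k] for k in range(3)]

ms = [1.0, 2.0, 0.5]
rs = [[1.0, 0.0, 0.5], [-0.5, 1.0, 0.0], [0.3, -0.7, 2.0]]
w = [0.2, -1.0, 3.0]
a = [1.0, 2.0, -1.5]
lhs = inertia_rate_fd(ms, rs, w, a)
rhs = inertia_rate_formula(ms, rs, w, a)
print(lhs, rhs)
```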

For an arbitrary system of particles, such as a gas, the body frame may be 
difficult to determine. Actually, we have defined the body frame only for a 
rigid body, and we are free to define the body frame in any convenient way 
for other systems. For example, for a system consisting of a gas in a rigid box, 
we could choose the body frame of the box as body frame for the entire 
system. Then, Equation (1.24) would be interpreted as a separation of the
angular momentum into a part $\mathcal{I}\boldsymbol{\omega}$ for “the system as a whole” and a residual
angular momentum $\sum_i m_i \mathbf{r}_i' \times \dot{\mathbf{r}}_i'$ of the gas in the box. In general, however, it
is more natural to define the body frame by imposing the condition

$$\sum_i m_i \mathbf{r}_i' \times \dot{\mathbf{r}}_i' = 0 \tag{1.30}$$

on motions in the body frame so $\mathbf{l} = \mathcal{I}\boldsymbol{\omega}$ describes the resultant angular
momentum of the entire system. This agrees with our definition for rigid
bodies. Of course, for an arbitrary system the derivative of the inertia tensor
will not be given by the expression (1.28) for a rigid body; it will include terms
describing change in the structure of the system as well as the terms in (1.28)
which are due solely to rotation.

Separation of rotational motion from vibrational motion is of great importance
in the theory of polyatomic molecules. A molecule can be modeled as a
system of atoms (point particles) vibrating about equilibrium points $\mathbf{a}_i$ in the
body frame. For small vibrations, then, the condition (1.30) can be approximated by

$$\sum_i m_i \mathbf{a}_i \times \dot{\mathbf{r}}_i' = 0. \tag{1.31}$$

This can be integrated directly, and the integration constant can be chosen so
that

$$\sum_i m_i \mathbf{a}_i \times \mathbf{r}_i' = 0. \tag{1.32}$$

This is the appropriate condition on the relative atomic displacements determining
the body frame. It must be used along with the center of mass
condition

$$\sum_i m_i \mathbf{r}_i' = 0. \tag{1.33}$$

Because of its simplicity, the condition (1.31) or (1.32) is preferable to (1.30),
to which it is equivalent only in the first approximation.

To separate the rotational energy from the rest of the internal energy in a 
system, it is convenient to use (1.21) in the form 


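$$\dot{\mathbf{r}}_i = R^\dagger\big(\boldsymbol{\omega}' \times \mathbf{r}_i' + \dot{\mathbf{r}}_i'\big)R, \tag{1.34}$$

where

$$\boldsymbol{\omega}' = R\,\boldsymbol{\omega}\,R^\dagger \tag{1.35}$$

is the angular velocity in the body frame. Then the internal kinetic energy $K_{\mathrm{int}}$
can be written

$$K_{\mathrm{int}} = \tfrac{1}{2}\sum_i m_i \dot{\mathbf{r}}_i^2 = \tfrac{1}{2}\sum_i m_i\big[(\boldsymbol{\omega}' \times \mathbf{r}_i')^2 + 2\boldsymbol{\omega}' \cdot (\mathbf{r}_i' \times \dot{\mathbf{r}}_i') + \dot{\mathbf{r}}_i'^2\big].$$

Using the identity

$$(\boldsymbol{\omega} \times \mathbf{r})^2 = (\boldsymbol{\omega} \wedge \mathbf{r}) \cdot (\mathbf{r} \wedge \boldsymbol{\omega}) = \boldsymbol{\omega} \cdot (\mathbf{r}\,\mathbf{r} \wedge \boldsymbol{\omega}),$$

we can write

$$\sum_i m_i (\boldsymbol{\omega}' \times \mathbf{r}_i')^2 = \boldsymbol{\omega}' \cdot (\mathcal{I}'\boldsymbol{\omega}'), \tag{1.36}$$

where

$$\mathcal{I}'\boldsymbol{\omega}' = \sum_i m_i \mathbf{r}_i'\,\mathbf{r}_i' \wedge \boldsymbol{\omega}' \tag{1.37}$$
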

is the inertia tensor in the body frame. Then the internal kinetic energy
assumes the form

$$K_{\mathrm{int}} = \tfrac{1}{2}\boldsymbol{\omega}' \cdot (\mathcal{I}'\boldsymbol{\omega}') + \boldsymbol{\omega}' \cdot \Big(\sum_i m_i \mathbf{r}_i' \times \dot{\mathbf{r}}_i'\Big) + \tfrac{1}{2}\sum_i m_i \dot{\mathbf{r}}_i'^2. \tag{1.38}$$

For a polyatomic molecule or, more generally, for a solid body, we can write

$$\mathbf{r}_i' = \mathbf{a}_i + \mathbf{s}_i,$$

where $\mathbf{s}_i$ is the displacement from the equilibrium position $\mathbf{a}_i$. Then, because
of the condition (1.31), the kinetic energy (1.38) reduces to

$$K_{\mathrm{int}} = \tfrac{1}{2}\boldsymbol{\omega}' \cdot (\mathcal{I}'\boldsymbol{\omega}') + \boldsymbol{\omega}' \cdot \Big(\sum_i m_i \mathbf{s}_i \times \dot{\mathbf{s}}_i\Big) + \tfrac{1}{2}\sum_i m_i \dot{\mathbf{s}}_i^2. \tag{1.39}$$

The first term in (1.39) is the rotational energy and the last term is the
vibrational energy. The middle term is the Coriolis energy, coupling the
rotational and vibrational motions. In many circumstances it is small, so
rotational and vibrational energies can be considered separately. Of course,
the internal energy of an ideal rigid body is all rotational, but a more realistic
model of a solid body is obtained by including the other terms in (1.39). The
vibrational energy of a solid is manifested in thermodynamic as well as elastic
properties of the solid.


Internal Energy and Work 


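To determine how the kinetic energy $K$ evolves with time, we differentiate it
and use the equation of motion (1.1); thus,

$$\dot{K} = \frac{d}{dt}\sum_i \tfrac{1}{2} m_i \dot{\mathbf{x}}_i^2 = \sum_i \dot{\mathbf{x}}_i \cdot (m_i \ddot{\mathbf{x}}_i) = \sum_i \Big(\mathbf{F}_i + \sum_j \mathbf{f}_{ij}\Big) \cdot \dot{\mathbf{x}}_i.$$

By virtue of the 3rd law (1.2),

$$\sum_i \sum_j \mathbf{f}_{ij} \cdot \dot{\mathbf{x}}_i = \tfrac{1}{2}\sum_i \sum_j \big(\mathbf{f}_{ij} \cdot \dot{\mathbf{x}}_i + \mathbf{f}_{ji} \cdot \dot{\mathbf{x}}_j\big) = \tfrac{1}{2}\sum_i \sum_j \mathbf{f}_{ij} \cdot (\dot{\mathbf{x}}_i - \dot{\mathbf{x}}_j)$$

$$= \tfrac{1}{2}\sum_i \sum_j \mathbf{f}_{ij} \cdot \dot{\mathbf{r}}_{ij} = \sum_{i<j} \mathbf{f}_{ij} \cdot \dot{\mathbf{r}}_{ij},$$

where

$$\mathbf{r}_{ij} = \mathbf{x}_i - \mathbf{x}_j = \mathbf{r}_i - \mathbf{r}_j. \tag{1.40}$$

Hence, changes in the total kinetic energy are determined by the equation

$$\dot{K} = \sum_i \mathbf{F}_i \cdot \dot{\mathbf{x}}_i + \sum_{i<j} \mathbf{f}_{ij} \cdot \dot{\mathbf{r}}_{ij}. \tag{1.41}$$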


Equation (1.41) can be formally integrated to get 


$$\Delta K = K(t_2) - K(t_1) = \sum_i \int_1^2 \mathbf{F}_i \cdot d\mathbf{x}_i + \sum_{i<j} \int_1^2 \mathbf{f}_{ij} \cdot d\mathbf{r}_{ij}. \tag{1.42}$$

The limits on the integrals have been abbreviated; for example,

$$\int_1^2 \mathbf{F}_i \cdot d\mathbf{x}_i = \int_{\mathbf{x}_i(t_1)}^{\mathbf{x}_i(t_2)} \mathbf{F}_i \cdot d\mathbf{x}_i. \tag{1.43}$$

This integral quantity is called work, specifically, the work done by external
forces on the $i$th particle during a displacement from the position $\mathbf{x}_i(t_1)$ to the
position $\mathbf{x}_i(t_2)$. The right side of (1.42) is the total work on the system,
consisting of the sum of the works done on each particle by external and
internal forces. Thus, the mathematical equation (1.42) can be expressed in
the following words: For any system of particles, in a specified time interval
the change in total kinetic energy is equal to the total work done on the system 
by external and internal forces. This is the general work-energy theorem. A 
special case of the theorem is more useful in practice, so we turn to that next. 
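
Before specializing, the general theorem (1.42) can be illustrated with a small numerical experiment. The sketch below makes several assumptions that are not in the text: two particles on a line joined by a spring of natural length `L`, a constant external force acting on the first particle only, and a velocity Verlet integrator. It accumulates the external and internal work integrals step by step and compares their sum with the change in kinetic energy.

```python
# Sketch: check the work-energy theorem (1.42) for two particles on a line
# joined by a spring, with a constant external force on particle 1 only.
# All parameter values and the integrator are illustrative assumptions.

def run(steps=20000, dt=1e-4):
    m1, m2, k, L = 1.0, 2.0, 10.0, 1.0   # masses, spring constant, natural length
    F1 = 3.0                             # constant external force on particle 1
    x1, x2, v1, v2 = 0.0, 1.5, 0.0, 0.0

    def f12(x1, x2):
        # internal force on particle 1 by particle 2 (f21 = -f12)
        return k * ((x2 - x1) - L)

    K0 = 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2
    W_ext = 0.0
    W_int = 0.0
    f = f12(x1, x2)
    for _ in range(steps):
        # velocity Verlet: half kick, drift, half kick
        v1 += 0.5 * dt * (F1 + f) / m1
        v2 += 0.5 * dt * (-f) / m2
        dx1, dx2 = dt * v1, dt * v2
        x1 += dx1
        x2 += dx2
        f_new = f12(x1, x2)
        v1 += 0.5 * dt * (F1 + f_new) / m1
        v2 += 0.5 * dt * (-f_new) / m2
        # accumulate the work integrals of (1.42) with the trapezoid rule
        W_ext += F1 * dx1
        W_int += 0.5 * (f + f_new) * (dx1 - dx2)   # f12 . d(r12), r12 = x1 - x2
        f = f_new
    K1 = 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2
    return K1 - K0, W_ext, W_int

delta_K, W_ext, W_int = run()
print(delta_K, W_ext + W_int)
```

The small residual between the two printed numbers reflects the time-stepping error of the integrator, not a failure of the theorem.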

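If the internal forces are conservative, then they are derivable from a
potential energy function $V$, that is,

$$\mathbf{f}_{ij} = -\nabla_{\mathbf{r}_{ij}} V.$$

We have already assumed that the internal forces are central, and we know
from Section 4-5 that a conservative central potential can be a function only
of the distance between particles. Hence,

$$V = V(r_{12}, r_{13}, \ldots), \tag{1.44}$$

where $r_{ij} = |\mathbf{r}_{ij}| = |\mathbf{r}_i - \mathbf{r}_j| = |\mathbf{x}_i - \mathbf{x}_j|$. In most applications the potential is
a sum of 2-particle potentials $V_{ij}$ like so,

$$V = \sum_{i<j} V_{ij}(r_{ij}).$$

However, this stronger hypothesis is unnecessary for present purposes. Now,
by the chain rule,

$$\mathbf{f}_{ij} = -\big(\nabla_{\mathbf{r}_{ij}} r_{ij}\big)\frac{\partial V}{\partial r_{ij}}.$$

Hence,

$$\dot{\mathbf{r}}_{ij} \cdot \mathbf{f}_{ij} = -\big(\dot{\mathbf{r}}_{ij} \cdot \nabla_{\mathbf{r}_{ij}} r_{ij}\big)\frac{\partial V}{\partial r_{ij}} = -\frac{dr_{ij}}{dt}\frac{\partial V}{\partial r_{ij}},$$

and

$$\sum_{i<j} \dot{\mathbf{r}}_{ij} \cdot \mathbf{f}_{ij} = -\sum_{i<j} \frac{\partial V}{\partial r_{ij}}\frac{dr_{ij}}{dt} = -\frac{dV}{dt}. \tag{1.45}$$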


Using this in (1.41) and separating the kinetic energy into external and 
internal parts, we obtain 


$$\dot{K}_{\mathrm{CM}} + \dot{E} = \sum_i \mathbf{F}_i \cdot \dot{\mathbf{x}}_i, \tag{1.46}$$

where

$$E = K_{\mathrm{int}} + V \tag{1.47}$$

is the total internal energy of the system. Thus, (1.46) describes the rate at
which the internal energy is altered by external forces. In integral form,

$$\Delta K_{\mathrm{CM}} + \Delta E = \sum_i \int_1^2 \mathbf{F}_i \cdot d\mathbf{x}_i, \tag{1.48}$$


it describes the change in energy resulting from work on the system by 
external forces. Equation (1.48) is the most useful version of the work-energy 
theorem. 

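We can, however, separate the changes in external and internal energies.
From the CM equation of motion (1.10) we get a separate equation for the
external kinetic energy:

$$\dot{K}_{\mathrm{CM}} = \frac{d}{dt}\big(\tfrac{1}{2} M \dot{\mathbf{X}}^2\big) = M\ddot{\mathbf{X}} \cdot \dot{\mathbf{X}} = \mathbf{F} \cdot \dot{\mathbf{X}}. \tag{1.49}$$

Substituting this into (1.46), we obtain

$$\dot{E} = \sum_i \mathbf{F}_i \cdot \dot{\mathbf{x}}_i - \mathbf{F} \cdot \dot{\mathbf{X}} = \sum_i \mathbf{F}_i \cdot (\dot{\mathbf{x}}_i - \dot{\mathbf{X}}) = \sum_i \mathbf{F}_i \cdot \dot{\mathbf{r}}_i. \tag{1.50}$$

Or, in integral form,

$$\Delta E = \sum_i \int_1^2 \mathbf{F}_i \cdot d\mathbf{r}_i. \tag{1.51}$$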


This looks simpler than (1.48); however, the integral here is not usually as 
convenient as the total work integral in (1.48). 

The theorem that the energy of an isolated system is conserved follows
trivially from (1.48). But it has nontrivial consequences. For one thing it helps
us formulate the concept of work correctly. From (1.48) alone, one might
interpret the work integral on the right as a measure of energy production. 
However, the correct interpretation is that work is a measure of energy 
transfer. To establish this, we separate the entire universe into two parts, the 
system of interest and its environment, the rest of the universe. The universe 
is isolated, so its energy is conserved. The energy of the universe can only be 
redistributed among the parts of the universe. Therefore, (1.48) must de- 
scribe an exchange of energy between the system and its environment. 


The First Law of Thermodynamics 


Thermodynamics is concerned with the transfer and storage of energy among 
objects. Statistical Mechanics is the branch of physics concerned with deriving 
the laws and equations of thermodynamics from the properties of atoms and
other particles composing macroscopic objects. In a word, statistical mechanics
aims to reduce thermodynamics to mechanics. Let us consider, in a
qualitative way, the mechanical basis for some fundamental thermodynamic 
concepts. 

The number of atoms in a macroscopic object is immense, about $10^{25}$ in a
golf ball, for example. It is not only impossible to keep track of individual 
atoms, it is undesirable, because the glut of information would be unwieldy. 
The best that can be done is to express the internal energy and the work done 
on a macroscopic object as functions of a few macroscopic variables, such as 
the volume of the system, while microscopic variables are controlled only 
partially and indirectly. It is the job of statistical mechanics and thermodyn- 
amics to specify precisely how this is done. However, the result is a separation 
of the work done on a system into two parts, the work —W done by altering 
macroscopic variables and the remainder of the work Q due to changes in 
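microscopic variables. Thus,

$$\sum_i \int_1^2 \mathbf{F}_i \cdot d\mathbf{x}_i = Q - W.$$

So the work-energy theorem (1.48) is given the form

$$\Delta K_{\mathrm{CM}} + \Delta E = Q - W. \tag{1.52}$$

This is the famous first law of thermodynamics. The term $\Delta K_{\mathrm{CM}}$ is often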
carelessly neglected in the statements of this law. But it is essential in some of 
the most elementary problems, for example, in determining the rise in 
temperature of a sliding block as frictional forces reduce its translational 
kinetic energy. 

The negative sign appears in (1.52) because, as is customary in thermodynamics,
$W$ denotes the macroscopic work done by the system rather than the
work done “on” the system. The term $Q$ is commonly referred to as the heat
transferred to the system, thereby perpetuating in language the old miscon- 
ception that “heat” is a physical entity of some kind. Rather, heat transfer is a 
particular mode of energy transfer. It would be better to refer to Q as 
microscopic work to indicate what that mode is. 

There are two distinct ways that microscopic work is performed. The first is
by “thermal contact”; when two macroscopic objects are in contact, energy is
transferred from one to the other by interactions among atoms at the surface
of contact. The second is by “radiation”. Electromagnetic radiation (light)
may be emitted when atoms in the system collide or, according to quantum 
mechanics, by a spontaneous process within a single atom. Emission and 
absorption of light involves energy transfer between the system and the 
radiation field, that is, a mode of work. 

In thermodynamics, when macroscopic parameters are held fixed, the 
internal energy of an object is expressed as a function of a single variable, the 
thermodynamic temperature. The temperature variable can be identified
with the internal energy per particle in a perfect gas. A perfect gas is a system
of noninteracting identical particles. Hence, all of its internal energy is
kinetic, and the temperature $T$ of the gas can be defined by

$$\tfrac{3}{2} kT = \frac{E}{N} = \frac{1}{N}\sum_{i=1}^{N} \tfrac{1}{2} m \dot{\mathbf{r}}_i^2, \tag{1.53}$$

where $k$ is known as Boltzmann’s constant. The specific value of Boltzmann’s
constant is of no concern to us here. Boltzmann’s constant is merely a 
conversion factor changing the temperature unit into the energy unit; it is a 
relic of times before it was realized that temperature is a measure of energy. 
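
To make the role of $k$ as a conversion factor concrete, the sketch below evaluates a temperature from a list of particle velocities, assuming the conventional form $\tfrac{3}{2}kT$ = mean internal kinetic energy per particle. The function name, the particle mass (roughly that of a helium atom) and the velocity sample are hypothetical; the SI value of $k$ is used only to express the result in kelvin.

```python
# Sketch: temperature of a perfect gas of identical particles, taking
# (3/2) k T = mean internal kinetic energy per particle (cf. Eq. 1.53).
# The mass and velocity sample are illustrative assumptions.

K_BOLTZMANN = 1.380649e-23  # J/K, exact by SI definition

def gas_temperature(mass, velocities):
    """Temperature in kelvin; mass in kg, velocities as 3-component lists in m/s."""
    n = len(velocities)
    # CM velocity of identical particles is the plain average
    v_cm = [sum(v[k] for v in velocities) / n for k in range(3)]
    mean_ke = sum(
        0.5 * mass * sum((v[k] - v_cm[k]) ** 2 for k in range(3))
        for v in velocities
    ) / n
    return 2.0 * mean_ke / (3.0 * K_BOLTZMANN)

speed = 1000.0
sample = [[speed, 0.0, 0.0], [-speed, 0.0, 0.0],
          [0.0, speed, 0.0], [0.0, -speed, 0.0],
          [0.0, 0.0, speed], [0.0, 0.0, -speed]]
T = gas_temperature(6.6e-27, sample)
print(T)
```

Subtracting the CM velocity first ensures that only the internal energy enters, so a uniformly translating gas is assigned the same temperature as one at rest.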

The internal energy of a perfect gas provides a standard to which the 
internal energy E of any other macroscopic object can be compared. This 
leads to an expression E = E(T) for the object’s internal energy as a function 
of temperature. The function E = E(T) compares the energy of the object to 
the energy of a perfect gas under “equivalent conditions”. To be sure, the 
perfect gas is an imperfect model of a real gas, just as the rigid body is an 
imperfect model of a solid body. Nevertheless, the perfect gas provides a 
theoretical standard for measurements of internal energy, just as the rigid 
body provides a standard for measurements of length. 


Open Systems 


So far we have considered systems composed of a definite set of particles. 
Such systems are said to be closed. An open system is one which is free 
to exchange particles with its surroundings. It can be defined as a set of 
particles within some specified spatial region, usually a region enclosed by the 
boundaries of some macroscopic object. All macroscopic objects are open 
systems, but often the rate at which they exchange particles with their 
surroundings is so small that they can be regarded as closed systems. To 
handle systems for which particle exchange is significant, our theorems for 
closed systems must be generalized. The full generalization is best carried out 
within the domain of continuum mechanics, so only a special case will be 
considered here. 

Suppose that in a short time $\Delta t$ a body with small mass $\Delta M$ and velocity $\mathbf{U}$
coalesces with a larger body of mass $M$ and velocity $\mathbf{V}$, as shown in Figure 1.1.
This is an inelastic collision, so energy of the macroscopic motion is not
conserved; most of it is converted to internal energy of the body. Let us
suppose also that the collision imparts no significant angular momentum to
the body. Now, mass is conserved in the collision, and, in the absence of
external forces, so is momentum. Hence,

$$M\mathbf{V} + \Delta M\,\mathbf{U} = (M + \Delta M)(\mathbf{V} + \Delta\mathbf{V}), \tag{1.54}$$

where $\Delta\mathbf{V}$ is the change in velocity of the larger body, which we regard as an
open system. We can pass to the case of a system accreting mass continuously
by dividing (1.54) by $\Delta t$ and passing to the limit $\Delta t \to 0$. Thus, we find

$$M\dot{\mathbf{V}} + \dot{M}(\mathbf{V} - \mathbf{U}) = \frac{d}{dt}(M\mathbf{V}) - \dot{M}\mathbf{U} = 0. \tag{1.55}$$

Fig. 1.1. Inelastic collision (before impact and after impact).

Evidently this equation generalizes to

$$\frac{d}{dt}(M\mathbf{V}) = \mathbf{F} + \dot{M}\mathbf{U} \tag{1.56}$$

if the system is subject to an external force $\mathbf{F}$.

Equation (1.56) describes the rate of momentum change in an open system
(on the left side) as a result of momentum transfer from its surroundings (on
the right side). The momentum transfer is accomplished in two ways, by the
action of an external force and by momentum flux. The term $\dot{M}\mathbf{U}$ is called the
momentum flux, because it describes the rate that momentum is carried into
the system by particles crossing the boundary of the system.


Example 1.1 


As an example of an open system, consider a raindrop falling through a 
stationary cloud. It will accrete mass at a rate proportional to its velocity V 
and cross-sectional area $\pi r^2$. For spherical raindrops, the mass $M$ is proportional
to $r^3$, hence, the rate of accretion is described by

$$\dot{M} = cM^{2/3}V, \tag{1.57}$$

where $c$ is a constant. If $x$ is the displacement of the drop in time $t$, then $V = \dot{x}$
and $\dot{M} = V\,dM/dx$, so

$$\frac{dM}{dx} = cM^{2/3}.$$

The variables are separable, so we can integrate to get the mass as a function
of displacement:

$$M^{1/3} - M_0^{1/3} = \frac{c}{3}x, \tag{1.58}$$

where $M_0$ is the initial mass.
In accordance with (1.56), if resistive forces are neglected, the equation of
motion of a vertically falling raindrop is

$$\frac{d}{dt}(MV) = Mg. \tag{1.59}$$

Since $M$ is given as a function of $x$ by (1.58), a change of the independent
variable from $t$ to $x$ is indicated. Multiplying (1.59) by $M$, we obtain

$$M\frac{d}{dt}(MV) = MV\frac{d}{dx}(MV) = \frac{1}{2}\frac{d}{dx}(MV)^2 = M^2 g.$$

Hence,

$$M^2V^2 - M_0^2V_0^2 = 2g\int_0^x M^2\,dx. \tag{1.60}$$

This is readily integrated after inserting (1.58), but the result is much cleaner
if we neglect $M_0$ in relation to $M$, whence

$$V^2 = \frac{2g}{x^6}\int_0^x x^6\,dx = \frac{2g}{7}x,$$

so

$$V = \Big(\frac{2gx}{7}\Big)^{1/2}. \tag{1.61}$$

Differentiating with respect to time, we get

$$\dot{V} = \ddot{x} = \frac{g}{7}, \tag{1.62}$$

from which we can easily find the time dependence of the motion. 
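
The limiting acceleration $g/7$ can be checked by integrating the equations of motion directly. The following is a minimal sketch, assuming a small but nonzero initial mass, zero initial velocity, arbitrary values for $c$ and $g$, and a hand-written fourth-order Runge-Kutta step; none of these choices come from the text. It integrates $\dot{x} = V$, the accretion law (1.57) and the momentum equation (1.59), then reports the acceleration once $M \gg M_0$.

```python
# Sketch: integrate the raindrop equations (1.57) and (1.59) and compare the
# late-time acceleration with g/7. Parameter values are illustrative assumptions.

def derivs(state, c, g):
    x, V, M = state
    M_dot = c * M ** (2.0 / 3.0) * V      # accretion law (1.57)
    V_dot = g - M_dot * V / M             # from d(MV)/dt = M g, Eq. (1.59)
    return (V, V_dot, M_dot)

def rk4_step(state, dt, c, g):
    def shifted(k, f):
        return tuple(s + f * dt * ki for s, ki in zip(state, k))
    k1 = derivs(state, c, g)
    k2 = derivs(shifted(k1, 0.5), c, g)
    k3 = derivs(shifted(k2, 0.5), c, g)
    k4 = derivs(shifted(k3, 1.0), c, g)
    return tuple(s + dt * (a + 2.0 * b + 2.0 * d + e) / 6.0
                 for s, a, b, d, e in zip(state, k1, k2, k3, k4))

def fall(t_end=5.0, dt=1e-3, c=1.0, g=9.8, M0=1e-6):
    state = (0.0, 0.0, M0)                # start at rest with mass M0
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, c, g)
    return state

c, g, M0 = 1.0, 9.8, 1e-6
x, V, M = fall(c=c, g=g, M0=M0)
acceleration = derivs((x, V, M), c, g)[1]
print(acceleration, g / 7.0)
```

The same run also provides a check of (1.58), since $M^{1/3} - M_0^{1/3}$ should track $cx/3$ throughout the fall.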


Example 1.2 


Rocket propulsion affords another example of open system dynamics. In this 
case mass is expelled from the system instead of accreted, so (1.56) applies 
with $\dot{M}$ negative instead of positive. For a rocket in a uniform gravitational
field, the equation of motion (1.56) can be put in the form

$$M\dot{\mathbf{V}} = \dot{M}(\mathbf{U} - \mathbf{V}) + M\mathbf{g}. \tag{1.63}$$

Here the vector $\mathbf{e} = \mathbf{U} - \mathbf{V}$ is the exhaust velocity, the average velocity of
exhaust gases relative to the rocket. For constant $\mathbf{e}$, (1.63) integrates to

$$\mathbf{V} - \mathbf{V}_0 = -\mathbf{e}\,\log\frac{M_0}{M} + \mathbf{g}t, \tag{1.64}$$

where $M_0 = M(0)$ is the initial rocket mass. Of course the time dependence of
the mass $M = M(t)$ is determined by the burning rate $\dot{M}$ programmed in the
rocket. Once this is specified, the time dependence of the velocity is determined
by (1.64), and the displacement of the rocket can be found by direct
integration.
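
As an illustration, the sketch below evaluates (1.64) for vertical flight with a constant burning rate and compares it with a direct numerical integration of (1.63). The one-dimensional sign convention (upward positive, exhaust directed downward), the function names and the rocket parameters are assumptions made for this example only.

```python
# Sketch: closed-form rocket velocity (1.64) versus numerical integration of
# (1.63) for vertical flight with constant burn rate. The rocket parameters
# are hypothetical values chosen for illustration.
import math

def rocket_velocity(t, e, k, M0, g, V0=0.0):
    """Eq. (1.64) along the vertical, upward positive. e is the exhaust speed
    (exhaust directed downward), k the constant burning rate, so M = M0 - k t."""
    return V0 + e * math.log(M0 / (M0 - k * t)) - g * t

def integrate_rocket(t_end, e, k, M0, g, V0=0.0, n=6000):
    """Midpoint-rule integration of Eq. (1.63): V_dot = k e / M - g."""
    dt = t_end / n
    V = V0
    for i in range(n):
        t_mid = (i + 0.5) * dt
        V += dt * (k * e / (M0 - k * t_mid) - g)
    return V

# hypothetical rocket: 1000 kg initial mass, 10 kg/s burn, 2500 m/s exhaust speed
params = dict(e=2500.0, k=10.0, M0=1000.0, g=9.8)
V_formula = rocket_velocity(60.0, **params)
V_numeric = integrate_rocket(60.0, **params)
print(V_formula, V_numeric)
```

Integrating the velocity once more in the same loop would give the displacement mentioned above.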

The reader should be cautioned against the mistaken assumption that the
velocity $\mathbf{V}$ in our general equation of motion (1.56) is necessarily equal to the
center of mass velocity $\dot{\mathbf{X}}$. To understand the difference between $\mathbf{V}$ and $\dot{\mathbf{X}}$,
consider the momentum of the system,

$$M\mathbf{V} = \sum_i m_i \dot{\mathbf{x}}_i. \tag{1.65}$$

If internal motion is negligible then the system can be regarded as a rigid body
and every particle has the same velocity $\dot{\mathbf{x}}_i = \mathbf{V}$. However, from an open
system like a rocket, particles are suddenly expelled, with consequent shifts in
the rocket’s center of mass. Thus, the difference between $\dot{\mathbf{X}}$ and $\mathbf{V}$ is due to
motion of $\mathbf{X}$ within the body as a result of mass flux through the boundary.
Therefore, the error made by computing displacement under the assumption
that $\dot{\mathbf{X}} = \mathbf{V}$ cannot exceed the dimensions of the body. Usually we are more
interested in the motion of particles in a body than in the motion of the center of
mass, so $\mathbf{V}$ may be of more interest than $\dot{\mathbf{X}}$. However, for a system with
significant internal motion, such as a spinning body, we cannot identify $\mathbf{V}$ with
the velocity of the individual particles, so the relation of $\mathbf{V}$ to $\dot{\mathbf{X}}$ is important.


6-1. Exercises

(1.1) Justify the interpretation of force as the rate of momentum transfer
from one system to another and of torque as the rate of angular
momentum transfer.


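(1.2) For a closed system of particles with internal energy $E$ in a conservative
external field of force $\mathbf{F}(\mathbf{x}) = -\nabla V(\mathbf{x})$, show that the total
energy

$$\tfrac{1}{2} M\dot{\mathbf{X}}^2 + \sum_i V(\mathbf{x}_i) + E$$

is a constant of the motion, reducing to

$$\tfrac{1}{2} M\dot{\mathbf{X}}^2 - M\mathbf{X} \cdot \mathbf{g} + E$$

in a uniform gravitational field.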


(1.3) A uniform flexible chain of mass $m$ and length $a$ is initially at rest on
a smooth table with one end just hanging over the side. How long
will it take the chain to slide off the table and how much energy has
been dissipated in this time? What will its velocity be when it loses
contact with the table?

(1.4) A rocket launched from rest is programmed to maintain a constant 
exhaust velocity $e$ and burning rate $\dot{M} = -k$ until its fuel is exhausted.
The mass of the fuel is a fraction $f$ of the initial rocket mass
$M_0$. Neglecting air resistance and assuming a constant gravitational
acceleration $g$, show that the maximum height $h$ attained is

$$h = \frac{e^2}{2g}\big[\log(1 - f)\big]^2 + \frac{eM_0}{k}\big[f + \log(1 - f)\big].$$

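(1.5) For an open system with momentum $M\mathbf{V}$, continuously accreting
mass with momentum flux $\dot{M}_+\mathbf{U}_+$, and ejecting mass with momentum
flux $-\dot{M}_-\mathbf{U}_-$, show that momentum conservation leads to the equation
of motion

$$M\dot{\mathbf{V}} = \mathbf{F} + \dot{M}_+(\mathbf{U}_+ - \mathbf{V}) - \dot{M}_-(\mathbf{U}_- - \mathbf{V}),$$

while mass conservation gives

$$\dot{M} = \dot{M}_+ - \dot{M}_-.$$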


(1.6) An open-topped freight car of mass M is initially coasting on smooth 
rails with speed $v_0$. Rain is falling vertically.
(a) Determine the speed $v$ of the car after a mass $m$ of rain water
has accumulated in the car. 
(b) If the water leaks out as fast as it enters, determine the speed of 
the car after a mass m of rain water has passed through it. 


6-2. The Method of Lagrange 


The bookkeeping required to describe and analyze the motion of an N- 
particle system can often be simplified by a judicious choice of variables. 
Lagrange developed a systematic method for describing a system in terms of 
an arbitrary set of variables. In Section 3-10 the method was explained in 
detail for a 1-particle system, and its generalization to an N-particle system is 
straightforward, so we can treat it concisely. 

In the Newtonian approach, an N-particle system is described by specifying 
the position $\mathbf{x}_i = \mathbf{x}_i(t)$ of each particle as a function of time. To describe the
system, instead, in terms of some set of scalar variables
$\{q_\alpha = q_\alpha(t);\ \alpha = 1, 2, \ldots, n\}$, we must express the positions as functions
of the new variables,

$$\mathbf{x}_i = \mathbf{x}_i(q_1, q_2, \ldots, q_n, t), \tag{2.1}$$

where $i = 1, \ldots, N$. The scalar variables $q_\alpha$ are called generalized coordinates.

The set of particle positions $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$ is called a configuration of the
system. Since each of the $N$ position vectors $\mathbf{x}_i$ is a vector in a 3-dimensional
space, it takes $n = 3N$ generalized coordinates $q_\alpha$ to specify all possible
configurations of an $N$-particle system. Therefore, the $q_\alpha$ are coordinates of a
point $q = q(t)$ in a $3N$-dimensional space. This space is called configuration
space. The motion of an entire system is thus described by a single trajectory
$q = q(t)$ in the $3N$-dimensional configuration space instead of a set of $N$
trajectories $\mathbf{x}_i = \mathbf{x}_i(t)$ in the 3-dimensional position space. This helps us apply
our intuition and knowledge of the dynamics of a single particle to the 
dynamics of a many-particle system. It is an important conceptual advantage 
of Lagrange’s method. 

It may happen that the position variables $\mathbf{x}_i$ are related by holonomic 
constraints specified by K scalar equations 

$$\phi_J(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N, t) = 0, \tag{2.2}$$

where $J = 1, 2, \ldots, K$. We saw in Section 3-10 that a holonomic constraint 
on a single particle confines the particle trajectory to a 2-dimensional surface 
in position space. Similarly, for an N-particle system each equation of 
constraint (2.2) determines a $(3N - 1)$-dimensional surface in configuration space 
and confines the trajectory $q = q(t)$ of the system to that surface. The set of K 
constraints confines the system to a $(3N - K)$-dimensional surface. 
Consequently, we can use the equations of constraint to eliminate K variables and 
specify the system by $n = 3N - K$ independent generalized coordinates 
$q_1, \ldots, q_n$. Such independent coordinates are sometimes called degrees of 
freedom, so $n = 3N - K$ is the number of degrees of freedom of the system. 

Our problem now is to convert the Newtonian equations of motion in 
position space to an equation of motion for the system in configuration space. 
Newton’s equation for the i-th particle can be put in the form 

$$m_i\ddot{\mathbf{x}}_i = -\nabla_{\mathbf{x}_i}V + \mathbf{f}_i + \mathbf{N}_i, \tag{2.3}$$

where $V = V(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N, t)$ is the potential for conservative forces, 
$\mathbf{f}_i = \mathbf{f}_i(\mathbf{x}_1, \ldots, \mathbf{x}_N, \dot{\mathbf{x}}_1, \ldots, \dot{\mathbf{x}}_N, t)$ is the force function for nonconservative 
forces, and $\mathbf{N}_i$ is the resultant force of constraint. 

The J-th equation of constraint (2.2) determines a constraining force 
$\lambda_J\nabla_{\mathbf{x}_i}\phi_J$, which can be interpreted as the force required to keep the i-th 
particle on the J-th surface of constraint. The resultant constraining force is 
therefore 

$$\mathbf{N}_i = \sum_J \lambda_J\nabla_{\mathbf{x}_i}\phi_J. \tag{2.4}$$

The equations of constraint $\phi_J = 0$ are among the givens of the problem; 
however, the scalars $\lambda_J = \lambda_J(\mathbf{x}_1, \ldots, \mathbf{x}_N, t)$ are among the unknowns and 
must be obtained from the solution if the constraining force is to be found. 

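Since the $q_\alpha$ are taken to be variables independent of the constraint 
equations, we must have $\partial_{q_\alpha}\phi_J = 0$, which, by application of the chain rule, 
gives us K equations 

$$\partial_{q_\alpha}\phi_J = \sum_i (\partial_{q_\alpha}\mathbf{x}_i)\cdot\nabla_{\mathbf{x}_i}\phi_J = 0. \tag{2.5}$$

These equations can be used to eliminate the constraining forces from the 
equations of motion; for 

$$\sum_i \mathbf{N}_i\cdot(\partial_{q_\alpha}\mathbf{x}_i) = \sum_J \lambda_J\sum_i (\partial_{q_\alpha}\mathbf{x}_i)\cdot\nabla_{\mathbf{x}_i}\phi_J = 0;$$

hence, from (2.3) we get 

$$\sum_i (m_i\ddot{\mathbf{x}}_i + \nabla_{\mathbf{x}_i}V - \mathbf{f}_i)\cdot(\partial_{q_\alpha}\mathbf{x}_i) = 0. \tag{2.6}$$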


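The n equations (2.6) can be re-expressed as equations of motion for the $q_\alpha$ 
by employing the chain rule for differentiation. 
Thus, from (2.1) we obtain 

$$\dot{\mathbf{x}}_i = \sum_\alpha \dot{q}_\alpha\,\partial_{q_\alpha}\mathbf{x}_i + \partial_t\mathbf{x}_i. \tag{2.7}$$

Differentiating this, we establish 

$$\partial_{\dot{q}_\alpha}\dot{\mathbf{x}}_i = \partial_{q_\alpha}\mathbf{x}_i,$$

$$\partial_{q_\alpha}\dot{\mathbf{x}}_i = \frac{d}{dt}(\partial_{q_\alpha}\mathbf{x}_i).$$

Hence, the first term in (2.6) can be brought into the form 

$$\sum_i m_i\ddot{\mathbf{x}}_i\cdot(\partial_{q_\alpha}\mathbf{x}_i) = \sum_i \left\{\frac{d}{dt}\bigl(m_i\dot{\mathbf{x}}_i\cdot(\partial_{q_\alpha}\mathbf{x}_i)\bigr) - m_i\dot{\mathbf{x}}_i\cdot\frac{d}{dt}(\partial_{q_\alpha}\mathbf{x}_i)\right\}$$

$$= \sum_i \left\{\frac{d}{dt}\,\partial_{\dot{q}_\alpha}\bigl(\tfrac{1}{2}m_i\dot{\mathbf{x}}_i^2\bigr) - \partial_{q_\alpha}\bigl(\tfrac{1}{2}m_i\dot{\mathbf{x}}_i^2\bigr)\right\}$$

$$= \frac{d}{dt}(\partial_{\dot{q}_\alpha}K) - \partial_{q_\alpha}K, \tag{2.8}$$

where 

$$K = \sum_i \tfrac{1}{2}m_i\dot{\mathbf{x}}_i^2 \tag{2.9}$$

is the total kinetic energy of the system. 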
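By substitution, the potential can be expressed as a function of the 
generalized coordinates 

$$V(\mathbf{x}_1(q_1, \ldots, q_n, t), \ldots, \mathbf{x}_N(q_1, \ldots, q_n, t), t) = V(q_1, \ldots, q_n, t). \tag{2.10}$$

Hence, the second term in (2.6) becomes 

$$\sum_i (\partial_{q_\alpha}\mathbf{x}_i)\cdot\nabla_{\mathbf{x}_i}V = \partial_{q_\alpha}V. \tag{2.11}$$

This can be combined with the first term (2.8), giving 

$$\sum_i (m_i\ddot{\mathbf{x}}_i + \nabla_{\mathbf{x}_i}V)\cdot(\partial_{q_\alpha}\mathbf{x}_i) = \frac{d}{dt}(\partial_{\dot{q}_\alpha}L) - \partial_{q_\alpha}L, \tag{2.12}$$

where we have introduced a new function 

$$L = K - V = L(q_1, \ldots, q_n, \dot{q}_1, \ldots, \dot{q}_n, t) \tag{2.13}$$

called the Lagrangian of the system. 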
For the third term in (2.6) we introduce the notation 

$$F_\alpha = \sum_i \mathbf{f}_i\cdot(\partial_{q_\alpha}\mathbf{x}_i). \tag{2.14}$$

The quantity $F_\alpha$ is called the $q_\alpha$-component of the generalized force. The right 
side of (2.14) shows that $F_\alpha$ can be interpreted as the component of force on 
the system in the “direction” of a change in $q_\alpha$; this “direction” is a direction 
in configuration space rather than position space. 

Using (2.12) and (2.14), we get (2.6) finally in the form 

$$\frac{d}{dt}(\partial_{\dot{q}_\alpha}L) - \partial_{q_\alpha}L = F_\alpha, \tag{2.15}$$

where $\alpha = 1, \ldots, n$. These are called Lagrange’s equations for the system. 
They are the desired equations of motion for the system in terms of 
generalized coordinates. 
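
As a numerical illustration of Lagrange's equations (2.15), the following Python sketch evaluates the left side by central differences for one generalized coordinate with no generalized force. The harmonic oscillator Lagrangian, the step size h, and the names lagrange_residual, L_osc, true_path and wrong_path are illustrative assumptions and do not come from the text.

```python
import math

def lagrange_residual(L, q, t, h=1e-3):
    """Approximate d/dt(dL/dqdot) - dL/dq at time t along the path q(t).

    L(q, qdot, t) is the Lagrangian; every derivative is a central difference.
    """
    def qdot(s):
        return (q(s + h) - q(s - h)) / (2 * h)

    def dL_dqdot(s):
        x, v = q(s), qdot(s)
        return (L(x, v + h, s) - L(x, v - h, s)) / (2 * h)

    def dL_dq(s):
        x, v = q(s), qdot(s)
        return (L(x + h, v, s) - L(x - h, v, s)) / (2 * h)

    ddt = (dL_dqdot(t + h) - dL_dqdot(t - h)) / (2 * h)
    return ddt - dL_dq(t)

# Assumed test case: L = K - V for a mass m on a spring of stiffness k.
m, k = 2.0, 8.0
omega = math.sqrt(k / m)

def L_osc(q, qdot, t):
    return 0.5 * m * qdot**2 - 0.5 * k * q**2

def true_path(t):          # solves m*qddot = -k*q
    return math.cos(omega * t)

def wrong_path(t):         # oscillates at the wrong frequency
    return math.cos(1.5 * omega * t)

res_true = lagrange_residual(L_osc, true_path, 0.3)
res_wrong = lagrange_residual(L_osc, wrong_path, 0.3)
```

The residual is close to zero only along the path that actually solves the equation of motion, which is the content of (2.15) with $F_\alpha = 0$.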

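Before Lagrange’s equations can be used, the Lagrangian and the 
generalized force must be expressed in terms of the generalized coordinates. For the 
potential this is done by simple substitution, as in (2.10). To do it for the 
kinetic energy, we must use the chain rule; thus, from 

$$K = \tfrac{1}{2}\sum_i m_i\Bigl(\sum_\alpha \dot{q}_\alpha\,\partial_{q_\alpha}\mathbf{x}_i + \partial_t\mathbf{x}_i\Bigr)^2$$

we get the kinetic energy $K = K(q_1, \ldots, q_n, \dot{q}_1, \ldots, \dot{q}_n, t)$ in the form 

$$K = \tfrac{1}{2}\sum_\alpha\sum_\beta M_{\alpha\beta}\,\dot{q}_\alpha\dot{q}_\beta + \sum_\alpha B_\alpha\dot{q}_\alpha + C, \tag{2.16}$$

where 

$$M_{\alpha\beta} = \sum_i m_i(\partial_{q_\alpha}\mathbf{x}_i)\cdot(\partial_{q_\beta}\mathbf{x}_i) = M_{\alpha\beta}(q_1, \ldots, q_n, t), \tag{2.17a}$$

$$B_\alpha = \sum_i m_i(\partial_{q_\alpha}\mathbf{x}_i)\cdot(\partial_t\mathbf{x}_i) = B_\alpha(q_1, \ldots, q_n, t), \tag{2.17b}$$

$$C = \tfrac{1}{2}\sum_i m_i(\partial_t\mathbf{x}_i)^2 = C(q_1, \ldots, q_n, t). \tag{2.17c}$$

The explicit time-dependence of $\mathbf{x}_i = \mathbf{x}_i(q_1, \ldots, q_n, t)$ results only from 
time-dependent constraints. Consequently, for time-independent constraints (or 
no constraints at all) the kinetic energy assumes the form 

$$K = \tfrac{1}{2}\sum_\alpha\sum_\beta M_{\alpha\beta}(q_1, \ldots, q_n)\,\dot{q}_\alpha\dot{q}_\beta, \tag{2.18}$$

where the $M_{\alpha\beta}$ are to be obtained from (2.17a). 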
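
To make the coefficients in (2.17a) and the form (2.18) concrete, here is a small Python sketch that builds the matrix of coefficients by central differences for a single particle described by plane polar coordinates. The function names, the finite-difference step and the sample numbers are assumptions chosen for illustration; the expected result for this case is the familiar diagonal matrix with entries m and m r^2.

```python
import math

def mass_matrix(positions, masses, q, h=1e-6):
    """M[a][b] = sum_i m_i (dx_i/dq_a).(dx_i/dq_b), as in Eq. (2.17a).

    positions(q) returns a list of position tuples, one per particle.
    Partial derivatives are taken by central differences with step h.
    """
    n = len(q)
    grads = []  # grads[a][i] is dx_i/dq_a
    for a in range(n):
        q_plus, q_minus = list(q), list(q)
        q_plus[a] += h
        q_minus[a] -= h
        x_plus, x_minus = positions(q_plus), positions(q_minus)
        grads.append([tuple((u - v) / (2 * h) for u, v in zip(p, m))
                      for p, m in zip(x_plus, x_minus)])
    return [[sum(mass * sum(u * v for u, v in zip(grads[a][i], grads[b][i]))
                 for i, mass in enumerate(masses))
             for b in range(n)]
            for a in range(n)]

def kinetic_energy(M, qdot):
    """K = (1/2) sum_ab M_ab qdot_a qdot_b, Eq. (2.18)."""
    n = len(qdot)
    return 0.5 * sum(M[a][b] * qdot[a] * qdot[b]
                     for a in range(n) for b in range(n))

def polar_position(q):
    """One particle in a plane, with q = (r, theta)."""
    r, theta = q
    return [(r * math.cos(theta), r * math.sin(theta))]

m = 3.0
M = mass_matrix(polar_position, [m], [2.0, 0.7])   # r = 2, theta = 0.7
K = kinetic_energy(M, [0.5, 1.2])                  # rdot = 0.5, thetadot = 1.2
```

For these values the matrix should agree with diag(m, m r^2) = diag(3, 12), so that K reduces to the usual polar expression (1/2) m (rdot^2 + r^2 thetadot^2).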

By deriving Lagrange’s equations, we have carried out once and for all the 
steps required to introduce any set of coordinates into Newton’s equations 
and eliminate the holonomic forces of constraint. From now on we can avoid 
those steps by constructing Lagrange’s equations straightaway. This is Lag- 
range’s method. 

Lagrange’s Method can be summarized as a series of steps for attacking a 
given dynamical problem: 


Step I 


Express any holonomic constraints in parametric form by determining the 
particle positions $\mathbf{x}_i$ as explicit functions $\mathbf{x}_i(q_1, \ldots, q_n, t)$ of an appropriate set 
of independent generalized coordinates. Diagrams are often a valuable guide 
to the selection of coordinates. Sometimes it is best to begin with dependent 
coordinates and then eliminate some of them by applying constraints in the 
nonparametric form (2.2). The best set of coordinates is usually determined 
by symmetry properties of the potential energy function, but it may be 
difficult to find. 


Step II 


Express the Lagrangian and, if needed, the generalized force as explicit 
functions of the generalized coordinates $q_\alpha$ and their velocities $\dot{q}_\alpha$. 


Step III 


Solve Lagrange’s equations. The equations may be quite complicated even 
for fairly simple problems. No single mathematical method suffices to handle 
all problems. 


Example 2.1 


Fig. 2.1. Atwood’s Machine. Fig. 2.2. The double pendulum. 

Atwood’s Machine consists of a pair of “weights” connected by an 
inextensible string passing over a pulley as shown in Figure 2.1. In this case the 
parametric equations for the positions of the weights are so simple that it is 
unnecessary to write them down. We choose the vertical displacements $x_1$ and 
$x_2$ for generalized coordinates. So, neglecting friction and the masses of the 
string and pulley, the kinetic energy of the system is 

$$K = \tfrac{1}{2}m_1\dot{x}_1^2 + \tfrac{1}{2}m_2\dot{x}_2^2,$$

while the potential energy is 

$$V = -m_1gx_1 - m_2gx_2.$$

Since the string is inextensible, the coordinates are related by 

$$x_1 + x_2 = C,$$

where C is a constant. Using this to eliminate one of the coordinates, we 
obtain the Lagrangian 

$$L = \tfrac{1}{2}(m_1 + m_2)\dot{x}_1^2 + (m_1 - m_2)gx_1 + m_2gC.$$

Inserting this in Lagrange’s equation 

$$\frac{d}{dt}(\partial_{\dot{x}_1}L) - \partial_{x_1}L = 0,$$

we obtain the equation of motion 

$$(m_1 + m_2)\ddot{x}_1 = (m_1 - m_2)g.$$

Note that the direction of the acceleration depends on the relative magnitudes 
of the masses. 
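
The result for Atwood's machine can be checked numerically from the Lagrangian alone. The Python sketch below is a minimal illustration under stated assumptions: the Lagrangian is the one obtained above after eliminating the second coordinate, the masses, g and C are arbitrary sample values, and the helper name acceleration is hypothetical. It uses the fact that for a Lagrangian L(x, v) with no explicit time dependence and no mixed x-v terms, Lagrange's equation reads (d2L/dv2) a = dL/dx.

```python
def atwood_lagrangian(x1, v1, m1, m2, g=9.81, C=2.0):
    """L = (1/2)(m1 + m2) v1^2 + (m1 - m2) g x1 + m2 g C."""
    return 0.5 * (m1 + m2) * v1**2 + (m1 - m2) * g * x1 + m2 * g * C

def acceleration(L, x, v, h=1e-2):
    """Acceleration implied by Lagrange's equation for L(x, v).

    Valid when L has no explicit time dependence and no x-v cross terms,
    so that d/dt(dL/dv) = (d2L/dv2) * a. Derivatives are finite differences.
    """
    dL_dx = (L(x + h, v) - L(x - h, v)) / (2 * h)
    d2L_dv2 = (L(x, v + h) - 2 * L(x, v) + L(x, v - h)) / h**2
    return dL_dx / d2L_dv2

m1, m2, g = 3.0, 1.0, 9.81

def L_heavy_first(x, v):
    return atwood_lagrangian(x, v, m1, m2, g)

def L_heavy_second(x, v):
    return atwood_lagrangian(x, v, m2, m1, g)

a_numeric = acceleration(L_heavy_first, 0.4, 0.0)
a_formula = (m1 - m2) * g / (m1 + m2)
a_swapped = acceleration(L_heavy_second, 0.4, 0.0)
```

Swapping the two masses reverses the sign of the acceleration, as noted in the text.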


Example 2.2 


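The double pendulum consists of one simple pendulum attached to the end of 
another. This is a two particle system subject to rigid constraints. Let the 
parameters of the system be specified by Figure 2.2. The plane of oscillation is 
specified algebraically by a unit bivector $\mathbf{i}$. With angles $\phi_1$ and $\phi_2$ as 
generalized coordinates, the parametric equations for the positions of the particles 
are 

$$\mathbf{x}_1 = l_1\hat{\mathbf{g}}e^{\mathbf{i}\phi_1},$$

$$\mathbf{x}_2 = \hat{\mathbf{g}}(l_1e^{\mathbf{i}\phi_1} + l_2e^{\mathbf{i}\phi_2}).$$

Whence, the kinetic energy is put in the form 

$$K = \tfrac{1}{2}m_1\dot{\mathbf{x}}_1^2 + \tfrac{1}{2}m_2\dot{\mathbf{x}}_2^2$$

$$= \tfrac{1}{2}\bigl[(m_1 + m_2)l_1^2\dot{\phi}_1^2 + m_2l_2^2\dot{\phi}_2^2 + 2m_2l_1l_2\dot{\phi}_1\dot{\phi}_2\cos(\phi_1 - \phi_2)\bigr].$$

The potential energy is given the form 

$$V = -m_1\mathbf{g}\cdot\mathbf{x}_1 - m_2\mathbf{g}\cdot\mathbf{x}_2$$

$$= -g\bigl\langle m_1l_1e^{\mathbf{i}\phi_1} + m_2(l_1e^{\mathbf{i}\phi_1} + l_2e^{\mathbf{i}\phi_2})\bigr\rangle_0$$

$$= -(m_1 + m_2)gl_1\cos\phi_1 - m_2gl_2\cos\phi_2.$$

Forming the Lagrangian $L = K - V$ and substituting this into Lagrange’s 
equation 

$$\frac{d}{dt}(\partial_{\dot{\phi}_1}L) - \partial_{\phi_1}L = 0,$$

we obtain 

$$\frac{d}{dt}\bigl[(m_1 + m_2)l_1^2\dot{\phi}_1 + m_2l_1l_2\dot{\phi}_2\cos(\phi_1 - \phi_2)\bigr] + (m_1 + m_2)l_1g\sin\phi_1 + m_2l_1l_2\dot{\phi}_1\dot{\phi}_2\sin(\phi_1 - \phi_2) = 0.$$

In a similar way we obtain 

$$\frac{d}{dt}\bigl[m_2l_2^2\dot{\phi}_2 + m_2l_1l_2\dot{\phi}_1\cos(\phi_1 - \phi_2)\bigr] + m_2l_2g\sin\phi_2 - m_2l_1l_2\dot{\phi}_1\dot{\phi}_2\sin(\phi_1 - \phi_2) = 0.$$

For small oscillations, these equations reduce to 

$$(m_1 + m_2)l_1^2\ddot{\phi}_1 + m_2l_1l_2\ddot{\phi}_2 + (m_1 + m_2)l_1g\phi_1 = 0,$$

$$m_2l_2^2\ddot{\phi}_2 + m_2l_1l_2\ddot{\phi}_1 + m_2l_2g\phi_2 = 0.$$

These are equations of motion for a pair of coupled harmonic oscillators. 
They can be solved by a change of variables that decouples the two equations. 
A systematic method for doing this will be developed in Section 6-4. 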
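
Pending the systematic treatment, the normal frequencies of the linearized double pendulum can already be estimated with a few lines of Python. This sketch assumes trial solutions proportional to exp(i w t), which turns the two small-oscillation equations into the condition det(K - w^2 M) = 0, a quadratic in w^2. The function name and the sample parameters are illustrative assumptions, not the method of Section 6-4.

```python
import math

def double_pendulum_frequencies(m1, m2, l1, l2, g=9.81):
    """Normal angular frequencies (low, high) of the linearized double pendulum.

    Inertia and stiffness coefficients are read off from the small-oscillation
    equations:
      (m1+m2) l1^2 phi1'' + m2 l1 l2 phi2'' + (m1+m2) l1 g phi1 = 0
      m2 l2^2 phi2''      + m2 l1 l2 phi1'' + m2 l2 g phi2      = 0
    """
    M11, M12, M22 = (m1 + m2) * l1**2, m2 * l1 * l2, m2 * l2**2
    K11, K22 = (m1 + m2) * l1 * g, m2 * l2 * g
    # det(K - s M) = a s^2 + b s + c = 0, with s = w^2
    a = M11 * M22 - M12**2
    b = -(K11 * M22 + K22 * M11)
    c = K11 * K22
    disc = math.sqrt(b * b - 4 * a * c)
    s_low = (-b - disc) / (2 * a)
    s_high = (-b + disc) / (2 * a)
    return math.sqrt(s_low), math.sqrt(s_high)

# Assumed sample case: equal masses and equal lengths.
g, l = 9.81, 1.0
w_low, w_high = double_pendulum_frequencies(1.0, 1.0, l, l, g)
```

For equal masses and equal lengths the quadratic gives w^2 = (g/l)(2 - sqrt 2) and w^2 = (g/l)(2 + sqrt 2), which provides a quick check of the coefficients above.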

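Now suppose we wish to account for the effect of air resistance on the 
motion of a double pendulum. The resistive force on each particle is 
proportional to its velocity, that is, 

$$\mathbf{f}_1 = -\mu_1\dot{\mathbf{x}}_1, \qquad \mathbf{f}_2 = -\mu_2\dot{\mathbf{x}}_2,$$

where $\mu_1$ and $\mu_2$ are positive constants. Therefore, the components of the 
generalized force (2.14) take the form 

$$F_{\phi_1} = \mathbf{f}_1\cdot(\partial_{\phi_1}\mathbf{x}_1) + \mathbf{f}_2\cdot(\partial_{\phi_1}\mathbf{x}_2)$$

$$= -\mu_1\dot{\phi}_1(\partial_{\phi_1}\mathbf{x}_1)^2 - \mu_2(\dot{\phi}_1\partial_{\phi_1}\mathbf{x}_2 + \dot{\phi}_2\partial_{\phi_2}\mathbf{x}_2)\cdot(\partial_{\phi_1}\mathbf{x}_2)$$

$$= -\mu_1l_1^2\dot{\phi}_1 - \mu_2\bigl(l_1^2\dot{\phi}_1 + l_1l_2\dot{\phi}_2\cos(\phi_1 - \phi_2)\bigr),$$

$$F_{\phi_2} = -\mu_2(\dot{\phi}_1\partial_{\phi_1}\mathbf{x}_2 + \dot{\phi}_2\partial_{\phi_2}\mathbf{x}_2)\cdot(\partial_{\phi_2}\mathbf{x}_2)$$

$$= -\mu_2\bigl(l_1l_2\dot{\phi}_1\cos(\phi_1 - \phi_2) + l_2^2\dot{\phi}_2\bigr).$$

Including these in the equations of motion, we get, in the small angle 
approximation, 

$$(m_1 + m_2)l_1^2\ddot{\phi}_1 + m_2l_1l_2\ddot{\phi}_2 + (m_1 + m_2)l_1g\phi_1 = -(\mu_1 + \mu_2)l_1^2\dot{\phi}_1 - \mu_2l_1l_2\dot{\phi}_2,$$

$$m_2l_2^2\ddot{\phi}_2 + m_2l_1l_2\ddot{\phi}_1 + m_2l_2g\phi_2 = -\mu_2l_1l_2\dot{\phi}_1 - \mu_2l_2^2\dot{\phi}_2.$$

The solution of these equations is also discussed in Section 6-4. 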
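
The generalized force components for the damped double pendulum can be cross-checked directly against the definition (2.14). The Python sketch below assumes ordinary Cartesian components in the plane of oscillation with angles measured from the downward vertical, and evaluates the partial derivatives by central differences; all function names and sample numbers are illustrative assumptions.

```python
import math

def positions(phi1, phi2, l1, l2):
    """Cartesian positions of the two bobs, angles from the downward vertical."""
    x1 = (l1 * math.sin(phi1), -l1 * math.cos(phi1))
    x2 = (x1[0] + l2 * math.sin(phi2), x1[1] - l2 * math.cos(phi2))
    return x1, x2

def coordinate_gradients(phi, l1, l2, h=1e-6):
    """grads[a][i] = d x_i / d phi_a by central differences."""
    grads = []
    for a in range(2):
        plus, minus = list(phi), list(phi)
        plus[a] += h
        minus[a] -= h
        xp = positions(plus[0], plus[1], l1, l2)
        xm = positions(minus[0], minus[1], l1, l2)
        grads.append([tuple((u - v) / (2 * h) for u, v in zip(xp[i], xm[i]))
                      for i in range(2)])
    return grads

def generalized_forces_numeric(phi, phidot, l1, l2, mu1, mu2):
    """F_a = sum_i f_i . (d x_i / d phi_a) with f_i = -mu_i * xdot_i."""
    grads = coordinate_gradients(phi, l1, l2)
    forces = []
    for i, mu in enumerate((mu1, mu2)):
        # chain rule (2.7): velocity of particle i
        velocity = tuple(phidot[0] * grads[0][i][c] + phidot[1] * grads[1][i][c]
                         for c in range(2))
        forces.append(tuple(-mu * v for v in velocity))
    return [sum(forces[i][c] * grads[a][i][c]
                for i in range(2) for c in range(2))
            for a in range(2)]

def generalized_forces_formula(phi, phidot, l1, l2, mu1, mu2):
    """Closed forms for F_phi1 and F_phi2 quoted in the text."""
    c = math.cos(phi[0] - phi[1])
    F1 = -mu1 * l1**2 * phidot[0] - mu2 * (l1**2 * phidot[0] + l1 * l2 * phidot[1] * c)
    F2 = -mu2 * (l1 * l2 * phidot[0] * c + l2**2 * phidot[1])
    return [F1, F2]

args = ((0.3, -0.5), (1.1, -0.7), 1.2, 0.8, 0.5, 0.3)
F_numeric = generalized_forces_numeric(*args)
F_formula = generalized_forces_formula(*args)
```

Agreement between the two lists confirms the algebra leading from (2.14) to the explicit damping terms.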


Example 2.3 


Consider a system of two particles connected by an inextensible, massless 
string of length a passing through a small hole in a table, as shown in Figure 
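2.3. Adopting polar coordinates $r$, $\theta$ in the plane of the table, the parametric 
equation for the position of the particle on the table is 

$$\mathbf{x}_1 = r\,\boldsymbol{\sigma}_1e^{\mathbf{i}\theta},$$

where $\boldsymbol{\sigma}_1$ denotes a fixed unit vector in the plane of the table and $\mathbf{i}$ is the 
unit bivector for that plane. 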
Utilizing the constraint, the position of the suspended particle is specified 
by 

$$\mathbf{x}_2 = \hat{\mathbf{g}}(a - r).$$

Consequently, the kinetic energy of the system is given the form 

$$K = \tfrac{1}{2}m_1(\dot{r}^2 + r^2\dot{\theta}^2) + \tfrac{1}{2}m_2\dot{r}^2,$$

and the potential energy is 

$$V = -m_2\mathbf{g}\cdot\mathbf{x}_2 = m_2g(r - a),$$

so 

$$L = K - V = \tfrac{1}{2}(m_1 + m_2)\dot{r}^2 + \tfrac{1}{2}m_1r^2\dot{\theta}^2 - m_2g(r - a).$$

Fig. 2.3. A particle subject to a central force of constant magnitude. 

Therefore, if friction is negligible, the equations of motion for the system are 

$$\frac{d}{dt}(\partial_{\dot{r}}L) - \partial_rL = (m_1 + m_2)\ddot{r} - m_1r\dot{\theta}^2 + m_2g = 0,$$

$$\frac{d}{dt}(\partial_{\dot{\theta}}L) - \partial_\theta L = \frac{d}{dt}(m_1r^2\dot{\theta}) = 0.$$

The last of these equations tells us that the angular momentum $l = m_1r^2\dot{\theta}$ is a 
constant of the motion. Using this to eliminate $\dot{\theta}$ from the first equation, we 
obtain the radial equation of motion 

$$(m_1 + m_2)\ddot{r} - \frac{l^2}{m_1r^3} + m_2g = 0.$$

It can be verified that $\dot{r}$ is an integrating factor for this equation by carrying 
out the differentiation in 

$$\frac{d}{dt}\left[\tfrac{1}{2}(m_1 + m_2)\dot{r}^2 + \frac{l^2}{2m_1r^2} + m_2gr\right] = 0.$$

Thus we obtain another constant of motion, the energy of the system, and 
we see that the orbit of the particle on the table is that of a particle subject to 
a conservative central force. 
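
The two constants of motion found in this example can be observed numerically. The following Python sketch integrates the equations of motion for r and theta with a classical fourth-order Runge-Kutta step and monitors the angular momentum and the energy. The integrator, the step size, the masses and the initial state are assumptions made for illustration only.

```python
def derivs(state, m1, m2, g):
    """Time derivatives of (r, rdot, theta, thetadot)."""
    r, rdot, theta, thetadot = state
    rddot = (m1 * r * thetadot**2 - m2 * g) / (m1 + m2)
    thetaddot = -2.0 * rdot * thetadot / r   # from d/dt(m1 r^2 thetadot) = 0
    return (rdot, rddot, thetadot, thetaddot)

def rk4_step(state, dt, m1, m2, g):
    """One classical Runge-Kutta step of size dt."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = derivs(state, m1, m2, g)
    k2 = derivs(shifted(state, k1, dt / 2), m1, m2, g)
    k3 = derivs(shifted(state, k2, dt / 2), m1, m2, g)
    k4 = derivs(shifted(state, k3, dt), m1, m2, g)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state, m1):
    r, _, _, thetadot = state
    return m1 * r**2 * thetadot

def energy(state, m1, m2, g):
    """(1/2)(m1+m2) rdot^2 + l^2/(2 m1 r^2) + m2 g r, with l = m1 r^2 thetadot."""
    r, rdot, _, thetadot = state
    return 0.5 * (m1 + m2) * rdot**2 + 0.5 * m1 * r**2 * thetadot**2 + m2 * g * r

m1, m2, g = 1.0, 2.0, 9.81
state = (1.0, 0.0, 0.0, 3.0)            # r, rdot, theta, thetadot at t = 0
l_start, E_start = angular_momentum(state, m1), energy(state, m1, m2, g)
r_min = state[0]
for _ in range(5000):                   # 5 time units with dt = 1e-3
    state = rk4_step(state, 1e-3, m1, m2, g)
    r_min = min(r_min, state[0])
l_end, E_end = angular_momentum(state, m1), energy(state, m1, m2, g)
```

Both quantities stay constant to within the integration error while r oscillates between two turning points, as expected for motion in a conservative central force.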


Ignorable Coordinates 


The last example illustrates a valuable general principle. Lagrange’s equation 
for the angle produced a constant of motion, the angular momentum, because 
the Lagrangian was independent of the angle. In general, for a conservative 
system, if 

$$\partial_{q_\alpha}L = 0$$

for some coordinate $q_\alpha$, then it follows trivially from Lagrange’s equation 
(2.15) that 

$$p_\alpha \equiv \partial_{\dot{q}_\alpha}L$$

is a constant of the motion. The coordinate $q_\alpha$ is then said to be ignorable or 
cyclic, because it has the following properties exhibited by the angle variable 
in the last example: 

(1) The constant of motion $p_\alpha$ can be used to eliminate the $q_\alpha$ from the 
remaining equations of motion and so effectively reduce the number 
of variables in the problem. 

(2) The coordinate $q_\alpha$ must be a periodic (i.e. cyclic) function of time. 

(3) The condition $\partial_{q_\alpha}L = 0$ results from some symmetry property of the 
potential energy (such as its independence of angle in the example). 

This suggests the possibility of a general method for choosing coordinates 
to simplify the equations of motion, but we shall not pursue it. In Section 6-4 
we consider ways to use symmetry properties to help solve equations of 
motion. The systematic study of symmetries in equations of 
motion is a major topic in modern theoretical physics. 


Fig. 2.4. Block sliding down a moveable inclined plane. Fig. 2.5. A compound Atwood machine. 


6-2. Exercises 


(2.1) A block slides down the inclined plane surface of another block 
resting on a horizontal plane, as in Figure 2.4. Assuming negligible 
friction, find the accelerations $\ddot{x}$ and $\ddot{X}$ of the coordinates indicated 
in the figure. 

(2.2) Find the accelerations of the masses $m_1$, $m_2$, $m_3$ in the compound 
Atwood machine of Figure 2.5. Neglect friction and the masses of 
the pulleys. 


Fig. 2.6. A sliding pendulum. Fig. 2.7. 

(2.3) A simple pendulum is suspended from a bead on a frictionless 
horizontal wire, as shown in Figure 2.6. Determine the equations of 
motion in terms of the coordinates $x$ and $\phi$. 

(2.4) Two masses on smooth inclined planes are connected by string to a 
massless pulley consisting of two rigidly attached spools with 
diameters in ratio 2 to 1, as in Figure 2.7. Determine the motion of the 
system. 

(2.5) Two simple pendulums are connected by a massless spring with 
stiffness constant $k$ attached at a distance $a$ from the supports as in 
Figure 2.8. The spring is unstretched when the pendulums are 
vertical. Determine Lagrange’s equations of motion for the system. 

(2.6) Two equal masses are suspended from identical massless pulleys as 
shown in Figure 2.9. Find their accelerations. 


Fig. 2.8. Coupled pendulums. Fig. 2.9. 


6-3. Coupled Oscillations and Waves 


A rigid body model of a solid object consists of a system of particles with 
interparticle forces keeping the particles at fixed separations. A more realistic 
model accounts for deformations of a solid with internal forces allowing 
changes in interparticle separations. In an elastic solid the internal forces 
oppose small deformations from a stable equilibrium configuration. There- 
fore, by the general argument developed in Section 3-8, the interparticle 
restoring forces can be described by Hooke’s law even without more specific 
knowledge about interparticle interactions. Thus, we arrive at a model of an 
elastic solid as a system of particles attached to their neighbors by (massless) 
springs, in other words, a system of coupled harmonic oscillators. The 
mathematical formulation of the model consists of a system of coupled linear 
second order differential equations. The theory of small oscillations is con- 
cerned with the analysis of such models. Before undertaking a systematic 
development of the theory, in this section we study the simplest examples to 
gain familiarity with the basic ideas. 


Two Coupled Harmonic Oscillators 


The main ideas in the theory of small oscillations appear in the simplest 
model, consisting of two identical harmonic oscillators with a linear coupling. 
Therefore, it will be profitable to study the ramifications of that model in 
detail. To have a specific physical realization of the abstract mathematical 
model in mind, consider an elastic string of two particles with fixed 
endpoints, as illustrated in Figure 3.1. The particles are connected to each 
other and the endpoints by (massless) springs. When the particles are at rest 
at the equilibrium points, the string has a uniform tension 

$$T = \kappa a = \kappa_{12}a_{12}, \tag{3.1}$$

where $\kappa$, $\kappa_{12}$ are the force constants and $a$, $a_{12}$ are the equilibrium lengths of 
the springs. 

Fig. 3.1. A pair of coupled isotropic harmonic oscillators. 

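The forces on the particles are described by Hooke’s law, so displacements 
$\mathbf{q}_1$ and $\mathbf{q}_2$ are governed by the equations of motion 

$$m\ddot{\mathbf{q}}_1 = -\kappa\mathbf{q}_1 - \kappa_{12}(\mathbf{q}_1 - \mathbf{q}_2), \tag{3.2a}$$

$$m\ddot{\mathbf{q}}_2 = -\kappa\mathbf{q}_2 - \kappa_{12}(\mathbf{q}_2 - \mathbf{q}_1). \tag{3.2b}$$

We can decouple these equations by re-expressing them as equations for the 
“mean displacement” $\mathbf{Q}_+ = \tfrac{1}{2}(\mathbf{q}_1 + \mathbf{q}_2)$ and the “relative displacement” 
$2\mathbf{Q}_- = \mathbf{q}_1 - \mathbf{q}_2$. Thus, by adding the Equations (3.2a, b) we obtain 

$$\ddot{\mathbf{Q}}_+ + \omega_+^2\mathbf{Q}_+ = 0, \tag{3.3a}$$

where $\omega_+^2 = \kappa/m$. The difference of Equations (3.2a, b) gives us 

$$\ddot{\mathbf{Q}}_- + \omega_-^2\mathbf{Q}_- = 0, \tag{3.3b}$$

where 

$$\omega_-^2 = (\kappa + 2\kappa_{12})/m = \omega_+^2(1 + 2\kappa_{12}/\kappa). \tag{3.4}$$

We recognize (3.3a) and (3.3b) as equations for isotropic harmonic oscillators 
studied in Section 3-8, so we know that their solutions describe elliptical 
orbits and have the general mathematical form 

$$\mathbf{Q}_+ = \mathbf{a}_+\cos\omega_+t + \mathbf{b}_+\sin\omega_+t, \tag{3.5a}$$

$$\mathbf{Q}_- = \mathbf{a}_-\cos\omega_-t + \mathbf{b}_-\sin\omega_-t, \tag{3.5b}$$

where $\mathbf{a}_\pm$ and $\mathbf{b}_\pm$ are constant vectors. Therefore, the displacements of the 
particles are superpositions 

$$\mathbf{q}_1 = \mathbf{Q}_+ + \mathbf{Q}_-, \qquad \mathbf{q}_2 = \mathbf{Q}_+ - \mathbf{Q}_- \tag{3.6}$$

of two harmonic oscillations with frequencies $\omega_+$ and $\omega_-$. 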
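
The decoupling into mean and relative displacements translates directly into a recipe for the motion. The Python sketch below is a one-dimensional illustration, so the displacements are treated as scalars rather than vectors, which is an assumption made here for brevity. It builds the two displacements from the normal mode solutions and then checks, by a finite-difference second derivative, that they satisfy the coupled equations of motion (3.2a, b). Function names and sample values are hypothetical.

```python
import math

def normal_frequencies(m, kappa, kappa12):
    """w_plus from (3.3a) and w_minus from (3.4)."""
    return math.sqrt(kappa / m), math.sqrt((kappa + 2 * kappa12) / m)

def displacements(t, q0, v0, m, kappa, kappa12):
    """q1(t), q2(t) from initial displacements q0 and velocities v0 (pairs)."""
    w_plus, w_minus = normal_frequencies(m, kappa, kappa12)
    Qp0, Qm0 = 0.5 * (q0[0] + q0[1]), 0.5 * (q0[0] - q0[1])
    Vp0, Vm0 = 0.5 * (v0[0] + v0[1]), 0.5 * (v0[0] - v0[1])
    Qp = Qp0 * math.cos(w_plus * t) + Vp0 / w_plus * math.sin(w_plus * t)
    Qm = Qm0 * math.cos(w_minus * t) + Vm0 / w_minus * math.sin(w_minus * t)
    return Qp + Qm, Qp - Qm            # superposition (3.6)

def residuals(t, q0, v0, m, kappa, kappa12, h=1e-3):
    """Left minus right side of (3.2a) and (3.2b) at time t."""
    (a1, a2) = displacements(t - h, q0, v0, m, kappa, kappa12)
    (b1, b2) = displacements(t, q0, v0, m, kappa, kappa12)
    (c1, c2) = displacements(t + h, q0, v0, m, kappa, kappa12)
    acc1 = (a1 - 2 * b1 + c1) / h**2
    acc2 = (a2 - 2 * b2 + c2) / h**2
    r1 = m * acc1 + kappa * b1 + kappa12 * (b1 - b2)
    r2 = m * acc2 + kappa * b2 + kappa12 * (b2 - b1)
    return r1, r2

m, kappa, kappa12 = 1.0, 4.0, 1.0
res = residuals(1.7, (0.1, 0.0), (0.0, 0.2), m, kappa, kappa12)
in_phase = displacements(2.3, (0.1, 0.1), (0.3, 0.3), m, kappa, kappa12)
```

With equal initial displacements and velocities only the symmetrical mode is excited, so the two particles move together.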

The particular combination of harmonic oscillations depends on the initial 
conditions. For example, initial conditions of the form $\mathbf{q}_1(0) = \mathbf{q}_2(0)$ and 
$\dot{\mathbf{q}}_1(0) = \dot{\mathbf{q}}_2(0)$ imply that $\mathbf{Q}_-(t) = 0$ for all $t$, so $\mathbf{q}_1(t) = \mathbf{Q}_+(t) = \mathbf{q}_2(t)$. Thus, the 
two particles oscillate in phase with a single frequency $\omega_+$. On the other hand, 
initial conditions of the form $\mathbf{q}_1(0) = -\mathbf{q}_2(0)$ and $\dot{\mathbf{q}}_1(0) = -\dot{\mathbf{q}}_2(0)$ imply that 
$\mathbf{Q}_+(t) = 0$, so $\mathbf{q}_1(t) = \mathbf{Q}_-(t) = -\mathbf{q}_2(t)$. Then, the two particles oscillate out of 
phase with frequency $\omega_-$. 

Such collective oscillations with a single frequency of all particles in a 
system are called normal modes of the system. The frequencies $\omega_+$ and $\omega_-$ of 
the normal modes are called normal (natural or characteristic) frequencies. 
The variables $\mathbf{Q}_+$ and $\mathbf{Q}_-$ for the normal modes may be called normal 
coordinates, though the term is usually reserved for scalar variables as done 
below. For a two particle system of coupled oscillators the normal modes are 
of two types, the symmetrical mode with coordinate $\mathbf{Q}_+$ and the 
antisymmetrical mode with coordinate $\mathbf{Q}_-$. 

According to (3.4), the frequency $\omega_+$ of the symmetrical mode is 
necessarily smaller than the frequency $\omega_-$ of the antisymmetrical mode. This is the 
simplest case of a general result: In a system with any number of linearly 
coupled oscillators, the mode with highest symmetry has the lowest frequency. 
In a mode of lower symmetry, the springs work against each other, increasing 
the effective restoring force and thus producing a higher frequency. 

The symmetrical and antisymmetrical modes are illustrated in Figure 3.2 
for longitudinal oscillations and in Figure 3.3 for transverse oscillations. The 
symmetrical longitudinal and transverse modes may be regarded as different 
normal modes since they are linearly independent. However, in the present 
model, their normal frequencies are equal. Linearly independent modes with 
the same frequency are said to be degenerate. In the present case, the 
symmetrical normal mode is said to have a 3-fold degeneracy, since the 
coordinate $\mathbf{Q}_+$ can be expressed as a linear combination of a longitudinal 
mode and two independent transverse modes. Similarly, the antisymmetric 
normal mode is triply degenerate. 


Fig. 3.2. (a) Symmetrical (in phase) and (b) Antisymmetrical (out of phase) longitudinal 
normal modes. 

Fig. 3.3. (a) Symmetrical (in phase) and (b) Antisymmetrical (out of phase) transverse normal 
modes. 


Equation (3.5a) admits as a special case a circular solution of the form 

$$\mathbf{Q}_+ = \mathbf{a}_+e^{\mathbf{i}\omega_+t}, \tag{3.7}$$

where $\mathbf{a}_+$ is a transverse vector in a plane with unit bivector $\mathbf{i}$. If the plane is 
transverse, then the mode is illustrated by Figure 3.3a, where the two 
particles circulate in phase about their equilibrium points. This normal mode 
can be expressed as a linear combination of two orthogonal transverse modes. 
On the other hand, if the plane contains the equilibrium points of the 
particles, then the mode can be expressed as a combination of a longitudinal 
and a transverse mode. Conversely, the linear longitudinal and transverse 
modes can be expressed as linear combinations of such circular modes. Of 
course, there are similar results for the antisymmetrical modes. Longitudinal 
circular normal modes are illustrated in Figure 3.4. 


Fig. 3.4. (a) Symmetrical and (b) Antisymmetrical circular normal modes in a longitudinal 
plane. 


Taken together, the coupled equations (3.2a, b) are linear in the pair of 
variables $\mathbf{q}_1$ and $\mathbf{q}_2$; therefore, the superposition principle applies. This means 
that if we have two distinct solutions of the equations, then any linear 
combination of these solutions is also a solution. In particular, (3.6) tells us 
that any solution can be expressed as a linear combination of symmetrical and 
antisymmetrical normal modes which are themselves particular solutions. 
And each of these degenerate normal modes can be expressed as a 
combination of three linearly independent longitudinal, transverse and/or circular 
modes. Thus, we may select a set of six linearly independent normal modes 
and normalize them to unit amplitude (or energy). Let $\mathbf{a}_{nr}$ (for $n = 1, 2$, and 
$r = 1, 2, \ldots, 6$) be the normalized vectorial amplitude for the displacement 
of the nth particle in the rth normal mode. The pairs of vectors $\{\mathbf{a}_{1r}, \mathbf{a}_{2r}\}$ 
compose a basis for the six dimensional linear space of solutions to Equations 
(3.2a, b). Therefore, any solution of the equations can be written as the linear 
superposition 

$$\mathbf{q}_1(t) = \sum_{r=1}^{6}\mathbf{a}_{1r}Q_r(t),$$

$$\mathbf{q}_2(t) = \sum_{r=1}^{6}\mathbf{a}_{2r}Q_r(t), \tag{3.8}$$

where the coefficients $Q_r(t)$ are scalar normal coordinates; they are harmonic 
functions which can be given the form $Q_r(t) = C_r\cos(\omega_rt + \delta_r)$. The 
expression (3.8) is called a normal mode expansion. It has the advantage of reducing 
apparently complex motions of the individual particles to the simple collective 
motions of the normal modes. A complete expansion into normal modes is 
not always appropriate. Often a partial expansion into degenerate modes with 
different frequencies is preferable. Thus, in the present case we prefer the 
partial expansion (3.6). 


Energy Storage and Transfer by Coupled Oscillators 


Since Hooke’s Law is a conservative force, the total energy E of our coupled 
two particle system is conserved. In terms of particle coordinates 

$$E = \tfrac{1}{2}m(\dot{\mathbf{q}}_1^2 + \dot{\mathbf{q}}_2^2) + \tfrac{1}{2}\kappa(\mathbf{q}_1^2 + \mathbf{q}_2^2) + \tfrac{1}{2}\kappa_{12}(\mathbf{q}_1 - \mathbf{q}_2)^2. \tag{3.9}$$

The last term is the “coupling energy”, which may be regarded as residing in 
the connecting “spring”, that is, in the “mutual bond” between the particles. 
In the absence of external interactions, the energy E remains stored in the 
system. The coupling between oscillators allows for a transfer of energy from 
one to the other, but the energy in each normal mode is separately conserved, 
as Equations (3.3a, b) imply. Thus, we can write 

$$E = E_+ + E_-, \tag{3.10a}$$

where 

$$E_\pm = \tfrac{1}{2}M(\dot{\mathbf{Q}}_\pm^2 + \omega_\pm^2\mathbf{Q}_\pm^2), \tag{3.10b}$$

and substitution of (3.6) into (3.9) shows that $M = 2m$ is the total mass. 
Thus, energy can be stored independently in each normal mode. 
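
The separation of the energy into mode energies can be confirmed for an arbitrary state. The Python sketch below treats the displacements as scalars (an assumption for brevity) and takes M = 2m in (3.10b), which is the value obtained by substituting (3.6) into (3.9). The function names and sample numbers are illustrative.

```python
def total_energy(q1, q2, v1, v2, m, kappa, kappa12):
    """Energy (3.9) in terms of particle coordinates and velocities."""
    return (0.5 * m * (v1**2 + v2**2)
            + 0.5 * kappa * (q1**2 + q2**2)
            + 0.5 * kappa12 * (q1 - q2)**2)

def mode_energies(q1, q2, v1, v2, m, kappa, kappa12):
    """E_plus and E_minus from (3.10b), with M = 2m assumed."""
    M = 2.0 * m
    w_plus_sq = kappa / m
    w_minus_sq = (kappa + 2.0 * kappa12) / m
    Qp, Qm = 0.5 * (q1 + q2), 0.5 * (q1 - q2)
    Vp, Vm = 0.5 * (v1 + v2), 0.5 * (v1 - v2)
    E_plus = 0.5 * M * (Vp**2 + w_plus_sq * Qp**2)
    E_minus = 0.5 * M * (Vm**2 + w_minus_sq * Qm**2)
    return E_plus, E_minus

sample = (0.3, -0.1, 0.7, 0.2, 1.5, 4.0, 0.6)   # q1, q2, v1, v2, m, kappa, kappa12
E = total_energy(*sample)
E_plus, E_minus = mode_energies(*sample)
```

The sum of the two mode energies reproduces (3.9) for any choice of displacements and velocities, which is the statement (3.10a).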

In Section 3-9 we studied the motion of a harmonic oscillator driven by a 
periodic force without considering the possibility of a feedback effect of the 
oscillator on the driver. We saw that an oscillator can absorb and store energy 
supplied by the driver, so a feedback of energy should result from the action 
of the oscillator on the driver. The simplest example of such an effect is found 
in the present case of coupled oscillators. We can observe it by imparting all 
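the energy initially to one of the oscillators. Thus, we strike the theoretical 
string by imposing the initial conditions $\dot{\mathbf{q}}_1(0) = \mathbf{v}_0$, $\dot{\mathbf{q}}_2(0) = 0$, 
$\mathbf{q}_1(0) = \mathbf{q}_2(0) = 0$. Then, from (3.5a, b) and (3.6) we get 

$$\mathbf{Q}_\pm(t) = \frac{\mathbf{v}_0}{2\omega_\pm}\sin\omega_\pm t \tag{3.11}$$

and 

$$\mathbf{q}_1 = \frac{\mathbf{v}_0}{2}\left(\frac{\sin\omega_+t}{\omega_+} + \frac{\sin\omega_-t}{\omega_-}\right), \tag{3.12a}$$

$$\mathbf{q}_2 = \frac{\mathbf{v}_0}{2}\left(\frac{\sin\omega_+t}{\omega_+} - \frac{\sin\omega_-t}{\omega_-}\right). \tag{3.12b}$$

To picture the motion described by these equations, we look at the limiting 
cases of weak and strong coupling. 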

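For weak coupling ($\kappa_{12} \ll \kappa$), we can write the relation (3.4) in the 
approximate form 

$$\omega_- \approx \omega_+(1 + 2\varepsilon), \tag{3.13}$$

where $\varepsilon = \kappa_{12}/2\kappa \ll 1$. In (3.12a, b), then, it is a good approximation to set 
$\omega_- = \omega_+$ in the denominators, but the small difference between $\omega_+$ and $\omega_-$ 
cannot be neglected in the phases. Thus, we write 

$$\mathbf{q}_1 \approx \frac{\mathbf{v}_0}{2\omega_+}(\sin\omega_+t + \sin\omega_-t) = \frac{\mathbf{v}_0}{\omega_+}\bigl[\cos\tfrac{1}{2}(\omega_+ - \omega_-)t\bigr]\sin\tfrac{1}{2}(\omega_+ + \omega_-)t$$

$$\approx \frac{\mathbf{v}_0}{\omega_+}[\cos\varepsilon\omega_+t]\sin\omega_+t, \tag{3.14a}$$

and similarly, 

$$\mathbf{q}_2 \approx -\frac{\mathbf{v}_0}{\omega_+}[\sin\varepsilon\omega_+t]\cos\omega_+t. \tag{3.14b}$$

The displacements are graphed in Figure 3.5 for $\varepsilon = 0.1$, showing the familiar 
phenomenon of beats, as the initial energy $E = \tfrac{1}{2}mv_0^2$ is passed back and forth 
between the oscillators. Each particle oscillates with frequency $\omega_+/2\pi$ while its 
amplitude is modulated with the lower frequency $\varepsilon\omega_+/2\pi$. The energy is 
transferred from one oscillator to the other in time $t = \pi/2\varepsilon\omega_+$. Complete 
transfer takes place even for very weak coupling. 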
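
The beat phenomenon can be seen by sampling the exact solution (3.12a, b) near the start of the motion and near the estimated transfer time. In this Python sketch the displacements are scalars, the coupling is taken much weaker (epsilon = 0.01) than the value used for Figure 3.5 so that the weak-coupling estimate is sharp, and w_minus is computed exactly from (3.4); these choices and all function names are assumptions for illustration.

```python
import math

def exact_displacements(t, v0, w_plus, eps):
    """q1(t), q2(t) from (3.12a, b), with w_minus from (3.4)."""
    w_minus = w_plus * math.sqrt(1.0 + 4.0 * eps)   # kappa12/kappa = 2*eps
    s_plus = math.sin(w_plus * t) / w_plus
    s_minus = math.sin(w_minus * t) / w_minus
    return 0.5 * v0 * (s_plus + s_minus), 0.5 * v0 * (s_plus - s_minus)

def transfer_time(w_plus, eps):
    """Weak-coupling estimate t = pi / (2 eps w_plus) for complete transfer."""
    return math.pi / (2.0 * eps * w_plus)

def peak_amplitudes(t_center, v0, w_plus, eps, samples=2000):
    """Largest |q1| and |q2| over one fast period centered on t_center."""
    half_window = math.pi / w_plus
    peak1, peak2 = 0.0, 0.0
    for j in range(samples + 1):
        t = t_center - half_window + 2.0 * half_window * j / samples
        q1, q2 = exact_displacements(t, v0, w_plus, eps)
        peak1, peak2 = max(peak1, abs(q1)), max(peak2, abs(q2))
    return peak1, peak2

v0, w_plus, eps = 1.0, 1.0, 0.01
start = peak_amplitudes(math.pi / w_plus, v0, w_plus, eps)
later = peak_amplitudes(transfer_time(w_plus, eps), v0, w_plus, eps)
```

At the start nearly all of the oscillation amplitude belongs to the first particle; one transfer time later it belongs almost entirely to the second, in line with the envelopes in (3.14a, b).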

For strong coupling ($\omega_- \gg \omega_+$), Equations (3.12a, b) are well 
approximated by 

$$\mathbf{q}_1 \approx \frac{\mathbf{v}_0}{2\omega_+}\sin\omega_+t \approx \mathbf{q}_2. \tag{3.15}$$

Thus, a blow delivered to either particle causes the two particles to move 
together as a single rigid body. Most of the energy delivered by the blow is 
stored as center of mass energy of the two particle system. Only a small 
fraction of it is stored in the antisymmetrical mode as “internal vibrational 
energy” of the system. 

Fig. 3.5. Beats in coupled oscillations. 

Note that the second oscillator can be treated as an “external agent” acting 
on the first oscillator by inserting the explicit expression (3.11) for 
$2\mathbf{Q}_- = \mathbf{q}_1 - \mathbf{q}_2$ into (3.2a) to get 

$$m(\ddot{\mathbf{q}}_1 + \omega_+^2\mathbf{q}_1) = \mathbf{a}\sin\omega_-t, \tag{3.16}$$

where $\mathbf{a}$ is a constant vector. This is the equation for an undamped, driven 
oscillator studied in Section 3-9, where we saw that through the driving force 
on the right the external agent feeds energy into the oscillator. However, if 
the agent is another oscillator, we have seen that when its energy is depleted 
the direction of energy flow is reversed. Moreover, in this case resonance 
cannot occur since the “driving frequency” $\omega_-$ is necessarily greater than the 
oscillator frequency $\omega_+$. 


One-dimensional Lattice Vibrations 


The simplest model of an elastic solid is a one-dimensional lattice (or string) 
of identical particles interacting linearly with nearest neighbors. This is a 
straightforward generalization of the two particle string we have just studied. 
Of course, the same model may be used to represent other physical systems, 
such as a string of macroscopic masses connected by springs or loaded on an 
elastic string. But it is of greatest interest in the theory of solids where, in 
spite of its simplicity, it has important physical implications. 

We consider an N particle string with fixed end points (Figure 3.6). The 
equilibrium positions $x_n$ of the particles are equally spaced with separation $a$ 
called the lattice constant. The string has length $L = (N + 1)a$, and $x_n = na$ 
for $n = 1, 2, \ldots, N$. If we limit our considerations to transverse or 
longitudinal vibrations, the particle displacements from equilibrium can be 
represented by scalar variables $q_n = q_n(t) = q(x_n, t)$, where either the particle 
name $n$ or the particle position $x_n$ can be used to label the variables. 

Fig. 3.6. Equilibrium positions for a 1-dimensional lattice of N identical particles. 

For particles of mass $m$, the equations of motion for small displacements 
are 

$$m\ddot{q}_n = -\kappa(q_n - q_{n-1}) - \kappa(q_n - q_{n+1}). \tag{3.17}$$

If this is to describe a solid, then the force constant $\kappa$ is a property of the 
material. If it is to describe a string under tension $T$, then $\kappa$ is given by 
$\kappa = T/a$. In any case, it is convenient to write the equations of motion in the 
form 

$$\ddot{q}_n = \omega_0^2(q_{n-1} - 2q_n + q_{n+1}), \tag{3.18}$$

where $n = 1, 2, \ldots, N$, and $\omega_0^2 = \kappa/m$. Boundary conditions at the ends of 
the string are imposed by writing 

$$q_0 = q_{N+1} = 0. \tag{3.19}$$

In a normal mode all particles vibrate with the same frequency, so to find 
the normal modes we look for solutions of (3.18) with the form 

$$q_n(t) = A_ne^{i\omega t}, \tag{3.20}$$

where $A_n = A(x_n)$ is constant. For the sake of algebraic convenience we 
follow the common practice of considering complex solutions and attributing 
physical significance only to their real parts. In that case, the unit imaginary $i$ 
has no specified physical interpretation, and we are free to suppose that it is 
the unit pseudoscalar. When the complex solution has been determined, we 
get the physical solution by taking its real (= scalar) part, written 

$$\operatorname{Re} q_n = \langle q_n\rangle_0. \tag{3.21}$$

This trick of “complexifying” the solution works because the equations of 
motion are linear, so the superposition principle applies. It is well to 
remember, however, that there are situations where the entire complex solution has 
physical significance, as in the case of Equation (3.7), where the unit 
imaginary is the bivector for a plane in physical space. We will take advantage of 
this again later on. 

Now, substituting the candidate solution (3.20) into the equations of 
motion (3.18), we get 

$$-\omega^2A_n = \omega_0^2(A_{n-1} - 2A_n + A_{n+1}). \tag{3.22}$$

This system of N equations is most easily solved by allowing $A_n$ to be complex 
and considering a trial solution of the form 

$$A_n = Ae^{inak} = A(\cos nak + i\sin nak), \tag{3.23a}$$

or, indexed by position, 

$$A_n = A(x_n) = Ae^{ikx_n} = A(\cos kx_n + i\sin kx_n), \tag{3.23b}$$

where $A$ and $k$ are scalars to be determined. Inserting this trial solution into 
(3.22), we obtain 

$$\omega^2 = -\omega_0^2(e^{-ika} - 2 + e^{ika}) = 2\omega_0^2(1 - \cos ka) = 4\omega_0^2\sin^2\left(\frac{ka}{2}\right). \tag{3.24}$$


Subject to this condition relating $\omega$ and $k$, the real and imaginary parts of $A_n$ 
satisfy (3.22) separately. However, from (3.23) we see that only the imaginary 
part satisfies the boundary condition (3.19) at $n = 0$. So we introduce a 
notation for the imaginary part by writing 

$$a_n = \frac{1}{2i}(A_n - A_n^\dagger) = \frac{1}{2i}(A_n - A_{-n}). \tag{3.25}$$

Consequently, 

$$a_n = A\sin kx_n = A\sin nak. \tag{3.26}$$

The constant $k$ is now determined by imposing the boundary condition at the 
other end point: 

$$a_{N+1} = A\sin[(N + 1)ak] = 0. \tag{3.27}$$

For integer $r$, this has solutions $k = k_r$ of the form 

$$k_r = \frac{\pi r}{(N + 1)a} = \frac{\pi r}{L}. \tag{3.28}$$

Equation (3.24) gives different values of $\omega$ for different $k_r$, so we rewrite it in 
the form 

$$\omega_r = 2\omega_0\sin\left(\frac{k_ra}{2}\right) = 2\omega_0\sin\left(\frac{\pi r}{2(N + 1)}\right). \tag{3.29}$$


Similarly, Equation (3.24) gives different values of a,, for different k,, so we 
write 
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a,, = Asin k,x, = A sin (= (3.30) 
Equation (3.19) describes a different normal mode for each different normal 
frequency @,, so we write 


Gy ee. (3.31) 


The scalar part $\langle q_{nr} \rangle_0 = a_{nr} \cos \omega_r t$ is the displacement of the n-th particle in the r-th
normal mode. The real coefficient $a_{nr}$ is the amplitude for the displacement of
the n-th particle. We show below that there are exactly N distinct normal
modes indexed by r in the range r = 1, 2, ..., N. It will be left as an exercise
to show that the constant A in (3.30) has the value

$$A = \left( \frac{2}{N+1} \right)^{1/2} \tag{3.32}$$

if the normal modes are normalized by the condition

$$\sum_{n=1}^{N} a_{nr}^2 = 1. \tag{3.33a}$$


Since the energy of a particle in harmonic motion is proportional to the 
square of its amplitude, this amounts to a normalization of the total energy in 
a mode. The normalization condition (3.33a) is a special case of the
"orthonormality relations"

$$\sum_{n=1}^{N} a_{nr} a_{ns} = \frac{2}{N+1} \sum_{n=1}^{N} \sin\!\left( \frac{\pi r n}{N+1} \right) \sin\!\left( \frac{\pi s n}{N+1} \right) = \delta_{rs}, \tag{3.33b}$$

where $[\delta_{rs}]$ is the N × N identity matrix. The right side of this expression
vanishes when r ≠ s, thus describing a kind of orthogonality of normal
modes; this result follows easily from a general argument given in Section 6-4.

The above results from (3.28) to (3.31) completely characterize the normal
modes of an N-particle string. For each normal mode, a wave form $a_r(x)$ with
values for every x in the interval 0 < x < L is defined by

$$a_r(x) = A \sin k_r x = A \sin\!\left( \frac{2\pi x}{\lambda_r} \right), \tag{3.34}$$

where

$$\lambda_r = \frac{2\pi}{k_r} = \frac{2L}{r} \tag{3.35}$$

is the wavelength and $k_r$ is called the wave number of the mode. At the lattice
points $x = x_n$, the wave form gives the particle amplitudes $a_{nr} = a_r(x_n)$.
Normal modes for the case N = 3 are illustrated in Figure 3.7. 
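
The formulas (3.29), (3.30) and (3.32) are easy to check numerically. The following is a minimal sketch in Python, assuming unit values for $\omega_0$ by default; the function names `normal_mode`, `residual` and `overlap` are illustrative and do not come from the text. It evaluates the frequencies and amplitudes for N = 3, measures how well each mode satisfies (3.22) with fixed ends, and forms the sums on the left side of (3.33b).

```python
import math

def normal_mode(N, r, omega0=1.0):
    """Frequency (3.29) and normalized amplitudes (3.30), (3.32) of mode r."""
    A = math.sqrt(2.0 / (N + 1))
    omega_r = 2.0 * omega0 * math.sin(math.pi * r / (2 * (N + 1)))
    amps = [A * math.sin(math.pi * r * n / (N + 1)) for n in range(1, N + 1)]
    return omega_r, amps

def residual(N, r, omega0=1.0):
    """Largest violation of (3.22) over n = 1..N, with a_0 = a_{N+1} = 0."""
    omega_r, amps = normal_mode(N, r, omega0)
    a = [0.0] + amps + [0.0]
    return max(
        abs(-omega_r**2 * a[n] - omega0**2 * (a[n - 1] - 2 * a[n] + a[n + 1]))
        for n in range(1, N + 1)
    )

def overlap(N, r, s):
    """Left side of (3.33b): sum over n of a_nr * a_ns."""
    _, a_r = normal_mode(N, r)
    _, a_s = normal_mode(N, s)
    return sum(x * y for x, y in zip(a_r, a_s))

# Print frequency and amplitudes of the three modes of a three-particle string.
for r in (1, 2, 3):
    omega_r, amps = normal_mode(3, r)
    print(r, round(omega_r, 4), [round(x, 4) for x in amps])
```

The residuals should vanish to rounding error, and `overlap(N, r, s)` should reproduce the identity matrix of (3.33b).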

Fig. 3.7. Normal modes of a three particle string. The wave forms are shown in dotted lines. (Surviving panel label: Third Mode (r = 3), $\lambda_3 = 2L/3$.)

Fig. 3.8. Some unphysical wave forms for normal modes.

To prove that an N-particle string has exactly N distinct normal modes, we
examine Equation (3.30) determining the particle amplitudes $a_{nr}$. First
note that all the amplitudes vanish identically when r = 0 or r = N + 1, so
these values for r describe a string at rest. Then note that for r = N + 2,
N + 3, ..., 2N + 1, 2N + 2 the values of the $a_{nr}$ are the same as for r = 1,
2, ..., N + 1, except for a trivial reversal of order and sign. This is
illustrated in Figure 3.8 for N = 3. Note that the wave forms in Figure 3.8 have
twice as many oscillations as the corresponding forms in Figure 3.7, but these
additional oscillations are physically meaningless, because they do not
correspond to any difference in particle displacements. Thus, there is a
smallest "physical wavelength" determined by the lattice constant, namely,
$\lambda_N = 2L/N = 2a(N+1)/N$, or more simply $\lambda_N = 2a$ for large N. By virtue of
(3.29), this smallest wavelength corresponds to a highest normal mode
frequency $\omega_N$, called the cutoff frequency of the system. Finally, to complete our
proof we simply note that similar conclusions obtain for other integer values
of r, positive or negative.
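
The repetition of amplitudes for r beyond N + 1 can be confirmed directly from (3.30). The sketch below is only an illustration (the helper names `amplitudes` and `is_alias` are hypothetical, and the overall factor A is set to 1); it checks that mode r and mode 2N + 2 - r give the same particle amplitudes with every sign reversed.

```python
import math

def amplitudes(N, r):
    """Unnormalized amplitudes sin(pi r n / (N + 1)) for n = 1..N, from (3.30)."""
    return [math.sin(math.pi * r * n / (N + 1)) for n in range(1, N + 1)]

def is_alias(N, r, tol=1e-12):
    """True if mode r equals mode 2N + 2 - r with all signs reversed."""
    high = amplitudes(N, r)
    low = amplitudes(N, 2 * N + 2 - r)
    return all(abs(h + l) < tol for h, l in zip(high, low))

# For N = 3 the modes r = 5, 6, 7 repeat r = 3, 2, 1 up to sign.
print([is_alias(3, r) for r in (5, 6, 7)])
```

So the wave forms with r > N + 1 differ from the physical ones only between lattice points, where there are no particles.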

We have identified and characterized a complete set of normal modes for 
the equation of motion (3.18). These are special solutions of the equation of 
motion subject to the boundary conditions (3.19). The general solution is a 
superposition of the normal modes. To represent it compactly, it is
convenient to introduce complex normal coordinates $Q_r = Q_r(t)$ defined by

$$Q_r = \left( C_r e^{i\delta_r} \right) e^{i\omega_r t} = C_r e^{i(\omega_r t + \delta_r)}, \tag{3.36}$$

where $C_r$ and $\delta_r$ are scalar constants. As before, we attribute physical
significance only to the real (i.e. scalar) part

$$\langle Q_r \rangle_0 = C_r \cos(\omega_r t + \delta_r). \tag{3.37}$$


Now the complex general solution $q_n = q_n(t)$ can be written in the form

$$q_n(t) = \sum_{r=1}^{N} a_{nr} Q_r(t), \tag{3.38}$$

where the amplitudes $a_{nr}$ are given by (3.30) with A given by (3.32) or A = 1,
as preferred.

There are exactly N normal coordinates, and, according to (3.36), each of 
these depends on two constants, so the general solution (3.38) depends on 2N 
constants, as we know it should from general theory. These constants can be 
determined from the initial conditions by inverting (3.38) to express the 
normal coordinates in the particle displacements. The orthogonality relations 
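(3.33b) make this easy. Thus,

$$\sum_n a_{ns} q_n = \sum_n \sum_r a_{ns} a_{nr} Q_r = \sum_r \delta_{sr} Q_r = Q_s,$$

proving that

$$Q_r(t) = \sum_{n=1}^{N} a_{nr} q_n(t). \tag{3.39}$$

The constants are consequently determined by the 2N equations

$$C_r e^{i\delta_r} = Q_r(0) = \sum_n a_{nr} q_n(0), \tag{3.40a}$$

$$i\omega_r C_r e^{i\delta_r} = \dot{Q}_r(0) = \sum_n a_{nr} \dot{q}_n(0). \tag{3.40b}$$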
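
As an illustration of how the inversion is used, the following Python sketch solves the initial value problem for an N-particle string. It works with real quantities only, which is an assumption made here for simplicity: each normal coordinate is written as $Q_r(t) = Q_r(0) \cos \omega_r t + (\dot{Q}_r(0)/\omega_r) \sin \omega_r t$, the real counterpart of (3.36). The names `modes` and `displacement` are illustrative.

```python
import math

def modes(N, omega0=1.0):
    """Normal frequencies (3.29) and normalized amplitudes (3.30), (3.32)."""
    A = math.sqrt(2.0 / (N + 1))
    freqs = [2 * omega0 * math.sin(math.pi * r / (2 * (N + 1)))
             for r in range(1, N + 1)]
    # amps[r - 1][n - 1] holds a_nr
    amps = [[A * math.sin(math.pi * r * n / (N + 1)) for n in range(1, N + 1)]
            for r in range(1, N + 1)]
    return freqs, amps

def displacement(q0, v0, t, omega0=1.0):
    """Particle displacements at time t from initial displacements and velocities."""
    N = len(q0)
    freqs, amps = modes(N, omega0)
    q = [0.0] * N
    for w, a in zip(freqs, amps):
        Q0 = sum(a[n] * q0[n] for n in range(N))   # (3.39) at t = 0
        V0 = sum(a[n] * v0[n] for n in range(N))   # time derivative of (3.39) at t = 0
        Q = Q0 * math.cos(w * t) + (V0 / w) * math.sin(w * t)
        for n in range(N):
            q[n] += a[n] * Q                       # superposition (3.38)
    return q

# Three particles, center particle displaced by one unit and released from rest.
print(displacement([0.0, 1.0, 0.0], [0.0, 0.0, 0.0], t=0.7))
```

For this symmetric pluck only the first and third modes are excited, so the center particle moves as $\tfrac{1}{2}(\cos \omega_1 t + \cos \omega_3 t)$.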


Traveling Waves 


Each normal mode is a standing wave in the sense that its wave form (3.34) is 
time independent. It is also called a harmonic wave, because every particle in 
the mode oscillates harmonically with a single frequency. There are, how- 
ever, other harmonic solutions of the equations of motion (3.18) which do not 
satisfy the boundary conditions (3.19) for standing waves. With minor changes 
in our analysis for normal modes, it is readily verified that, for arbitrary 
constants A and 6, 


$$q_\omega(x, t) = A e^{i\delta} e^{i(\omega t - kx)} \tag{3.41}$$

is a harmonic solution of (3.18) provided k is related to $\omega$ by (3.29). As
before, we attribute physical significance only to the scalar part

$$\langle q_\omega(x, t) \rangle_0 = A \cos(\omega t - kx + \delta). \tag{3.42}$$

This describes particle displacements at the positions $x = x_n = na$.
The function $q_\omega(x, t)$ specified by (3.41) represents a traveling harmonic
wave with velocity

$$v = \frac{\omega}{k}. \tag{3.43}$$

To establish that, we write x = x' + vt where x' is the position coordinate of
a reference system moving with velocity v in the positive x-direction.
Substituting this into (3.42) with (3.43) we obtain

$$\langle q_\omega(x, t) \rangle_0 = A \cos(kx' - \delta). \tag{3.44}$$

As a function of x', this is a fixed wave form. Therefore, we may regard it as a
fixed wave form moving with velocity v along the string. Likewise, $q_\omega(-x, t)$
describes a similar wave traveling in the opposite direction. Now note that for
$\delta = \pi/2$, (3.41) gives us

$$q_\omega(x, t) - q_\omega(-x, t) = 2A (\sin kx)\, e^{i\omega t}, \tag{3.45}$$

which is identical to the expression for a complex standing wave. Therefore,
every standing wave can be regarded as a superposition of two traveling
waves moving in opposite directions.
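
This superposition can be checked with ordinary complex arithmetic. In the sketch below the unit imaginary is represented by Python's `1j`, and the sample values of $\omega$ and k are arbitrary choices made for illustration; the function names are hypothetical. The difference of two counter-propagating waves of the form (3.41) with $\delta = \pi/2$ is compared with the standing wave $2A (\sin kx) e^{i\omega t}$.

```python
import cmath
import math

def traveling(x, t, A=1.0, delta=0.0, omega=2.0, k=3.0):
    """Complex traveling wave A exp(i delta) exp(i (omega t - k x)), as in (3.41)."""
    return A * cmath.exp(1j * (omega * t - k * x + delta))

def standing_from_traveling(x, t, A=1.0, omega=2.0, k=3.0):
    """Difference of waves traveling in opposite directions, with delta = pi/2."""
    d = math.pi / 2
    return traveling(x, t, A, d, omega, k) - traveling(-x, t, A, d, omega, k)

x, t = 0.4, 1.3
lhs = standing_from_traveling(x, t)
rhs = 2.0 * math.sin(3.0 * x) * cmath.exp(1j * 2.0 * t)
print(abs(lhs - rhs))
```

The printed difference should be at the level of rounding error for any choice of x and t.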

When N is very large (as in the model of an elastic solid as a string of
atoms), Equation (3.28) tells us that k is effectively a continuous variable. So
we write the relation (3.29) between $\omega$ and k as a continuous function

$$\omega = \pm 2\omega_0 \sin\!\left( \frac{ka}{2} \right), \tag{3.46}$$

where the sign is chosen to make $\omega$ positive depending on the value of k. This
equation is called a dispersion relation for the following reason. Using it to
eliminate k from (3.43), we see that it implies that the velocity of a traveling
harmonic wave depends on frequency, specifically

$$v = v(\omega) = \frac{\omega a}{2 \sin^{-1}(\omega / 2\omega_0)}. \tag{3.47}$$

Therefore, a "wave packet" composed of harmonic traveling waves with
different frequencies will collapse, because the component waves traveling at
different velocities will gradually separate, that is, disperse.
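
To see how strongly the velocity depends on frequency, one can evaluate (3.47) numerically. The following sketch assumes the positive branch of (3.46) and unit values of $\omega_0$ and a by default; `wave_number` and `phase_velocity` are illustrative names.

```python
import math

def wave_number(omega, omega0=1.0, a=1.0):
    """Invert (3.46) on the positive branch, valid for 0 < omega <= 2 * omega0."""
    return (2.0 / a) * math.asin(omega / (2.0 * omega0))

def phase_velocity(omega, omega0=1.0, a=1.0):
    """v = omega / k, equation (3.47)."""
    return omega / wave_number(omega, omega0, a)

# Velocity falls from omega0 * a at low frequency to 2 * omega0 * a / pi at cutoff.
for omega in (0.01, 0.5, 1.0, 1.5, 2.0):
    print(omega, round(phase_velocity(omega), 4))
```

The velocity decreases steadily with frequency, from $\omega_0 a$ in the low-frequency limit to $2\omega_0 a/\pi$ at the cutoff frequency $2\omega_0$.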

Dispersion relations are of the utmost importance in solid state physics,
where they are used to describe many different characteristics of materials.
The dispersion relation (3.46) is graphed in Figure 3.9. The two allowed signs
for k correspond to wave propagation in opposite directions. The range of k is
limited by the maximum value $k = \pi/a$ corresponding to the minimum
wavelength $\lambda = 2\pi/k = 2a$ and the maximum (cutoff) frequency $\omega = 2\omega_0$. As we
determined earlier, waves of shorter wavelength (higher frequency) cannot
be supported in a lattice. The limited allowed range for the wave number k is
called the first Brillouin zone in solid state physics. The small slope of the
dispersion curve near the boundary of the Brillouin zone implies that wave
velocities vary rapidly with frequency, thus producing large dispersion.

Fig. 3.9. Dispersion relation for a monatomic lattice. (Horizontal axis: k, from $-\pi/a$ to $\pi/a$.)

On the other hand, at low frequencies (the long-wavelength limit), the
dispersion relation (3.46) reduces to

$$\omega = 2\omega_0 \left( \frac{ka}{2} \right) = \omega_0 a k. \tag{3.48}$$

Therefore, in this region harmonic waves with different frequencies have
nearly the same velocity

$$v_0 = v(0) = \lim_{k \to 0} \frac{\omega}{k} = \omega_0 a. \tag{3.49}$$

The dotted line in Figure 3.9 is the dispersion curve that would obtain if all
harmonic waves had this velocity. The expression (3.49) enables us to
calculate the velocity of waves in a medium from measured elastic properties.
Recall that $\omega_0^2 = \kappa/m$, and $\kappa = T/a$ for a string under tension. Whence (3.49)
yields

$$v_0 = \left( \frac{T}{\varrho} \right)^{1/2}, \tag{3.50}$$

where $\varrho = m/a$ is the linear mass density of the string. Similarly, for an elastic
solid the elastic modulus Y, defined as the ratio of applied force to elongation
per unit length, is a measurable quantity. Whence, $\kappa = Y/a$, and the velocity
of a low frequency transverse wave is given by

$$v_0 = \left( \frac{Y}{\varrho} \right)^{1/2}. \tag{3.51}$$

It should be realized that we are talking about the velocity of waves with
wavelengths much greater than the lattice constant. In this domain the waves 
are insensitive to the granular microstructure of the material, which may 
therefore be regarded as a continuous medium. 

We can model a continuous string as the limit of a string of discrete
particles as mass $m \to 0$ and particle separation $a \to 0$ but $\varrho = m/a$ remains
finite. To get an equation of motion for the continuous string, we index
particles by their positions and write (3.18) in the form

$$\frac{\partial^2 q(x, t)}{\partial t^2} = \omega_0^2 a \left[ \frac{q(x+a, t) - q(x, t)}{a} - \frac{q(x, t) - q(x-a, t)}{a} \right].$$

Inserting in this the Taylor expansion

$$q(x \pm a, t) = q(x, t) \pm a \frac{\partial q}{\partial x} + \frac{1}{2} a^2 \frac{\partial^2 q}{\partial x^2} \pm \ldots,$$

in the limit $a \to 0$ we obtain

$$\frac{\partial^2 q(x, t)}{\partial t^2} = v_0^2 \frac{\partial^2 q(x, t)}{\partial x^2}, \tag{3.52}$$

where $v_0 = \lim \omega_0 a$, in agreement with the long-wavelength result (3.49).

Equation (3.52) is called the one-dimensional wave equation. The harmonic
wave (3.41) is a solution of this equation, and its general solution is a
superposition of such waves, all with the same constant velocity $v_0$.
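
The passage from the lattice equation to the wave equation can be illustrated numerically. The sketch below is an assumption-laden example: it uses the sample profile $q(x) = \sin 2x$, holds $v_0 = \omega_0 a$ fixed, and compares the lattice restoring term $\omega_0^2 [q(x+a) - 2q(x) + q(x-a)]$ with $v_0^2\, \partial^2 q/\partial x^2$ as the spacing a shrinks. The function names are illustrative.

```python
import math

def lattice_rhs(q, x, a, v0):
    """omega0^2 [q(x + a) - 2 q(x) + q(x - a)] with omega0 = v0 / a."""
    omega0 = v0 / a
    return omega0**2 * (q(x + a) - 2.0 * q(x) + q(x - a))

def continuum_rhs(d2q, x, v0):
    """Right side of the wave equation (3.52): v0^2 times the second x-derivative."""
    return v0**2 * d2q(x)

def profile(x):
    return math.sin(2.0 * x)           # sample wave form (an assumption)

def profile_xx(x):
    return -4.0 * math.sin(2.0 * x)    # its exact second derivative

for a in (0.1, 0.01, 0.001):
    err = abs(lattice_rhs(profile, 0.3, a, 1.0) - continuum_rhs(profile_xx, 0.3, 1.0))
    print(a, err)
```

The discrepancy falls roughly as $a^2$, which is the size of the first neglected term in the Taylor expansion.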

As a final matter in our study of waves in strings, we note that all our results 
are easily generalized to the case where the particle coordinates are vectors 
representing displacement in three dimensions instead of scalars representing
one-dimensional displacements, as we have already done for the two particle
case. In particular, the expression (3.41) for a traveling harmonic wave
generalizes to

$$\mathbf{q}_\omega(x, t) = \mathbf{a}_\omega e^{i(\omega t - kx)}, \tag{3.53}$$

where now the unit imaginary i is a bivector and the constant vector
amplitude $\mathbf{a}_\omega$ lies in the i-plane. We can construct normal modes from this by
superposition as in (3.45), yielding the expression (3.7) found previously in
the two particle case.

In contrast to the scalar form (3.41), the imaginary part of (3.53) has a
physical interpretation. This is easiest to picture when i is the bivector for a
transverse plane. Then, (3.53) represents a circularly polarized transverse
harmonic wave. To see that, note that for each fixed value of the particle
coordinate x, (3.53) describes a displacement rotating with angular velocity $\omega$
about the equilibrium point. Alternatively, for fixed t and variable x, (3.53)
represents a helical wave form as pictured in Figure 3.10. As time varies, the
helical wave form can be pictured as rigidly rotating about the axis of
equilibrium points, or moving rigidly without rotation along the axis with
velocity $v = \omega/k$. If the wave form is a right-handed helix, the wave is said
to have positive helicity. Unfortunately, in optics such a wave is said to be
left-circularly polarized. Similarly, the function

$$\mathbf{q}_\omega(-x, -t) = \mathbf{a}_\omega e^{-i(\omega t - kx)} \tag{3.54}$$

describes a harmonic wave with negative helicity (right-circularly polarized)
moving in the positive x direction provided both $\omega$ and k are positive.

Fig. 3.10. A left-circularly polarized traveling wave has positive helicity.

It is important to note that both cases (3.53) and (3.54) can be lumped into
one described by (3.53) if we allow the frequency to have negative as well as
positive values. Thus, we can assign a physical interpretation to the sign of the
frequency $\omega$ for a harmonic wave: it is the helicity of circular polarization. The
direction of wave motion is then determined by the sign of k relative to $\omega$. As
we have noted, it is in the positive (negative) x-direction if $v = \omega/k$ is positive
(negative). All these considerations apply equally well to electromagnetic
waves with $\mathbf{q}_\omega(x, t)$ replaced by the electric field vector $\mathbf{E}_\omega(x, t)$.

In solid state physics the one-dimensional lattice model studied here is 
generalized to three-dimensional lattices of various types composed of atoms 
of various kinds. To analyse more complicated models such as these ef- 
ficiently, we need a systematic general theory of small oscillations, so to that 
we now turn. 


6-3. Exercises


(3.1)  For a string of two identical particles coupled linearly, as in Figure
       3.1, suppose the string is plucked with the initial conditions
       $q_1(0) = A$, $q_2(0) = 0$, $\dot{q}_1(0) = \dot{q}_2(0) = 0$. Show that the
       displacements in time are given by

       $$q_1 = A (\cos \omega_0 t) \cos(\epsilon \omega_0 t),$$
       $$q_2 = A (\sin \omega_0 t) \sin(\epsilon \omega_0 t),$$

       where $2\omega_0 = \omega_+ + \omega_-$ and $2\epsilon\omega_0 = \omega_- - \omega_+$. Describe qualitative
       features of this solution in the weak and strong coupling limits. Why
       does the strong coupling result differ from the result in Equation
       (3.15)?

(3.2)  For a string of three identical particles coupled linearly as in Figure
       3.6, suppose the string is plucked by displacing the center particle
       longitudinally with initial condition $q_2(0) = A$, while
       $q_1(0) = q_3(0) = 0$ and $\dot{q}_1(0) = \dot{q}_2(0) = \dot{q}_3(0) = 0$. Determine the natural
       frequencies and the displacements $q_n(t)$ as explicit functions of time.

(3.3)  For a string of five identical particles coupled linearly as in Figure
       3.6, draw diagrams for the transverse normal modes from symmetry
       considerations alone and arrange them in order of increasing
       frequency. Then check your qualitative understanding by calculation.

(3.4)  Derive the normalization factor (3.32) for normal modes by
       evaluating the sum

       $$\sum_{n=1}^{N} \sin^2\!\left( \frac{\pi r n}{N+1} \right) = \frac{N+1}{2}.$$

       The following hints should be helpful. Write

       $$4 \sin^2 n\theta = 2(1 - \cos 2n\theta) = 2 - (z^n + z^{-n}),$$

       where z is a complex number obeying $z^{N+1} = 1$. The geometric
       series

       $$\sum_{n=0}^{\infty} z^n = (1 - z)^{-1}$$

       can be iterated to get

       $$\sum_{n=0}^{\infty} z^n = (1 + z + z^2 + \ldots + z^{k-1})(1 - z^k)^{-1},$$

       where k is any positive integer (consider k = 1 first). Then the finite
       sum

       $$\sum_{n=0}^{N} z^n$$

       can be evaluated by expressing it as a difference of two infinite
       sums.


6-4. Theory of Small Oscillations


The study of specific examples in the preceding section has provided us with a 
nucleus of ideas from which the general theory of small oscillations can be 
developed. A systematic formulation of the general theory has several bene- 
fits. First, it clarifies the range of physical problems to which the theory 
applies. Second, it organizes techniques for efficient problem solving. Third, 
it provides a conceptual framework for thinking about many particle systems 
as a whole. With the examples and concepts from the preceding section in 
mind, we can proceed rapidly to a concise formulation of the theory. We
develop the theory in a form suitable for a wider range of applications than we 
can consider here. Unfortunately, we do not have space for applications of 
the mathematical theory of groups to the general theory, a major topic in 
modern theoretical physics. But that can be found in the specialist literature. 

One should distinguish between mathematical and physical aspects of the 
theory of small oscillations. Mathematically, it belongs to the general linear 
systems theory, which is concerned with modeling the behavior of any system 
by systems of linear differential equations. Small oscillation theory deals with 
the case of deviations from a state of stable equilibrium. The analysis of any 
linear system can be carried out completely using techniques and results from 
the mathematical theory of linear differential equations. The well developed 
mathematical theory makes the analysis of any linear system a straightfor- 
ward task; it may be computationally complex when many variables are 
involved, but this difficulty has been largely overcome by the development of 
modern computers and computer software. 

The physical aspect of the theory of small oscillations concerns the physical 
interpretation of mathematical models. The same set of equations might be 
interpreted as a mathematical model for systems as diverse as an electrical 
network, a macroscopic system of springs and pendulums, or a microscopic 
system of atoms in a molecule or elastic solid. Our concern here will be with 
mechanical interpretations, specifically, with the mechanics of small displace- 
ments in a system of particles from a stable equilibrium configuration. Note 
that the term “small” refers to physical rather than mathematical aspects of 
the theory. It means that we are dealing with linear approximations to
nonlinear force laws, so the results have validity only for a range of states 
close to equilibrium. The equilibrium configuration is generally determined 
by nonlinear features of the force laws, so it must be taken for granted in a 
linear theory. 

For “large” displacements from equilibrium, the restoring forces are non- 
linear. The important subject of nonlinear oscillations has not yet been 
reduced to a theory with general results of wide applicability. It is currently an 
active field for research. Unfortunately, we don’t have the space here for a 
suitable introduction to the exciting recent developments in this field. 

Even the range of applications for the theory of small oscillations is too
broad to survey here. Our objective will be to develop the main ideas and
general results of the theory along with examples to show how they are 
applied. We consider only discrete systems of particles. The generalization to 
continuous systems is a major topic in continuum mechanics which we 
touched only briefly in the preceding section. 


Harmonic Systems 


We will employ the Lagrange formulation of mechanics developed in Section
6-2. We consider a conservative N-particle system with n degrees of freedom,
so, with a suitable set of generalized coordinates $q_1, q_2, \ldots, q_n$, the
interactions can be described by a potential energy function $V = V(q_1, \ldots, q_n)$. As
explained in Section 6-2, this includes the possibility that the system is subject
to time independent external constraints. We assume also that the system has
a state of stable static equilibrium at $q_1 = q_2 = \ldots = q_n = 0$. This requires
some explanation.

A system of particles is said to be in static equilibrium (in a given reference 
system) if all the particles remain at rest. This is possible only if the net force 
on each particle vanishes. In the Lagrange formulation, this equilibrium 
condition is expressed as a vanishing of the generalized force: 


$$K_\alpha = -\partial_{q_\alpha} V(q_1, q_2, \ldots, q_n) = 0, \tag{4.1}$$

where $\alpha = 1, 2, \ldots, n$. If the function $V(q_1, \ldots, q_n)$ is known, then this is a
system of n equations which can be solved for the equilibrium values $q_\alpha^0$.
However, we shall see that in some physical problems the equilibrium values
are known while the function V is not. In either case the equilibrium
condition (4.1) is satisfied, and we are free to adjust our coordinate system so
that $q_1^0 = \ldots = q_n^0 = 0$. Then the variables $q_\alpha$ directly describe departures
from the static equilibrium state.

The effects of small departures from equilibrium can be described by
approximating the potential with a Taylor expansion. Writing
$V(q) = V(q_1, \ldots, q_n)$, we have

$$V(q) = V(0) + \sum_\alpha q_\alpha \partial_{q_\alpha} V(0) + \frac{1}{2} \sum_\alpha \sum_\beta q_\alpha q_\beta \partial_{q_\alpha} \partial_{q_\beta} V(0) + \ldots \tag{4.2}$$

We are free to choose V(0) = 0, and the equilibrium condition (4.1) implies
that the second set of terms on the right vanish. So, to a first approximation
the potential is given by the quadratic function

$$V(q) = \frac{1}{2} \sum_\alpha \sum_\beta k_{\alpha\beta} q_\alpha q_\beta, \tag{4.3}$$

where

$$k_{\alpha\beta} = \frac{\partial^2 V(0)}{\partial q_\alpha \partial q_\beta} = k_{\beta\alpha}. \tag{4.4}$$

This is called the harmonic approximation and higher order terms in the
Taylor expansion are said to be anharmonic. A departure from equilibrium is
said to be "small" if the anharmonic contribution to the potential energy is
negligible to the accuracy desired.
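
When the potential is known only as a function to be evaluated, the force constants (4.4) can be estimated by finite differences. The sketch below is a minimal illustration: the potential `potential` (two pendulum-like wells with a quadratic coupling of strength c) is a hypothetical example chosen here, not one from the text, and the helper names are likewise illustrative.

```python
import math

def potential(q, c=0.5):
    """Hypothetical potential with equilibrium at q = 0 and V(0) = 0."""
    q1, q2 = q
    return (1 - math.cos(q1)) + (1 - math.cos(q2)) + 0.5 * c * (q1 - q2) ** 2

def force_constants(V, n, h=1e-4):
    """Central-difference estimate of k_ab = d^2 V / dq_a dq_b at q = 0, as in (4.4)."""
    k = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            def shifted(sa, sb):
                q = [0.0] * n
                q[a] += sa * h
                q[b] += sb * h
                return V(q)
            k[a][b] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return k

def harmonic(q, k):
    """Quadratic approximation (4.3) to the potential."""
    n = len(q)
    return 0.5 * sum(k[a][b] * q[a] * q[b] for a in range(n) for b in range(n))

k = force_constants(potential, 2)
print(k)
print(potential([0.01, -0.02]), harmonic([0.01, -0.02], k))
```

For this example the exact force constants are $k_{11} = k_{22} = 1 + c$ and $k_{12} = k_{21} = -c$, and for small displacements the two printed energies agree closely, the difference being the anharmonic contribution.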

In the harmonic approximation, the potential (4.3) corresponds to a
generalized force

$$K_\alpha(q) = -\partial_{q_\alpha} V(q) = -\sum_\beta k_{\alpha\beta} q_\beta. \tag{4.5}$$

The coefficients $k_{\alpha\beta}$ are called force constants; they are measures of the
coupling strength between different degrees of freedom.

The equilibrium point q = 0 is said to be stable if the quadratic potential
energy (4.3) is positive definite, that is, if

$$V(q) = \frac{1}{2} \sum_\alpha \sum_\beta k_{\alpha\beta} q_\alpha q_\beta \geq 0 \tag{4.6}$$

for all values of the $q_\alpha$, with equality only when every $q_\alpha$ vanishes. By
setting all $q_\alpha$ but one to zero, we see that (4.6)
implies $k_{\alpha\alpha} > 0$, so the corresponding generalized force $-k_{\alpha\alpha} q_\alpha$ draws the
system back to equilibrium. Actually, (4.6) is merely a sufficient condition for
stability; in Section 6-5 we shall see that stability is possible under more
general conditions.

According to (2.18), in terms of the generalized coordinates the system
kinetic energy K is a positive definite quadratic function of the $\dot{q}_\alpha$:

$$K = \frac{1}{2} \sum_\alpha \sum_\beta m_{\alpha\beta} \dot{q}_\alpha \dot{q}_\beta \geq 0. \tag{4.7}$$

In general, the mass coefficients $m_{\alpha\beta} = m_{\alpha\beta}(q)$ are functions of the
coordinates as well as the masses of the particles. But we can expand them in a
Taylor series,

$$m_{\alpha\beta}(q) = m_{\alpha\beta}(0) + \sum_\gamma q_\gamma \partial_{q_\gamma} m_{\alpha\beta}(0) + \ldots,$$

and, consistent with our approximation to the potential energy, we keep only
the terms of lowest order. So we regard the mass coefficients as constants

$$m_{\alpha\beta} = m_{\alpha\beta}(0) = m_{\beta\alpha}. \tag{4.8}$$

From the fact that every particle has a mass it follows that $m_{\alpha\alpha} \neq 0$ for all $\alpha$.

Now we form the Lagrangian $L = K - V$ for the system from (4.6) and
(4.7), and from Lagrange's equation (2.15) we determine the n equations of
motion for the system:

$$\sum_\beta \left( m_{\alpha\beta} \ddot{q}_\beta + k_{\alpha\beta} q_\beta \right) = F_\alpha, \tag{4.9}$$

where the $F_\alpha$ are components of the generalized force due to external agents.


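To facilitate analysis, we write the system of Equations (4.9) as a single
matrix equation

$$[m]\,|\ddot{q}) + [k]\,|q) = |F). \tag{4.10}$$

The mass matrix [m] is defined by

$$[m] = \begin{bmatrix} m_{11} & m_{12} & \cdots & m_{1n} \\ m_{21} & m_{22} & \cdots & m_{2n} \\ \vdots & \vdots & & \vdots \\ m_{n1} & m_{n2} & \cdots & m_{nn} \end{bmatrix}, \tag{4.11}$$

and the matrix [k] is defined similarly. The notation |q) indicates the column
of generalized coordinates defined by

$$|q) = \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{bmatrix}. \tag{4.12}$$

Of course,

$$|\ddot{q}) = \begin{bmatrix} \ddot{q}_1 \\ \ddot{q}_2 \\ \vdots \\ \ddot{q}_n \end{bmatrix} \quad \text{and} \quad |F) = \begin{bmatrix} F_1 \\ F_2 \\ \vdots \\ F_n \end{bmatrix}.$$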

It is convenient to introduce the notation

$$(q| = [\, q_1^\dagger, q_2^\dagger, \ldots, q_n^\dagger \,] \tag{4.13}$$

for the row matrix corresponding to the column matrix |q). The conjugation
symbol is appropriate if we want to employ complex coordinates as defined by
(3.20) and (3.21). Then

$$(q|q) = |q_1|^2 + |q_2|^2 + \ldots + |q_n|^2, \tag{4.14}$$

where $|q_\alpha|^2 = q_\alpha^\dagger q_\alpha$, and $|q_\alpha|^2 = q_\alpha^2$ if $q_\alpha$ is real. Now the expressions (4.7) and
(4.3) for kinetic and potential energies can be written

$$2K = (\dot{q}|[m]|\dot{q}), \tag{4.15}$$
$$2V = (q|[k]|q). \tag{4.16}$$


Any physical system modeled by a matrix equation of motion of the form
(4.10), where the matrices [m] and [k] are positive definite and [m] is
nonsingular, is called a harmonic system, because it generalizes the single
particle harmonic oscillator model treated in Sections 3-8 and 3-9. Our
experience with the harmonic oscillator will serve as a valuable guide for the
analysis of harmonic systems.

The matrix notation has conceptual as well as computational advantages. 
Just as we represent the position of a single particle by a vector in position 
space, so we can represent the configuration of a system of particles as a
vector in the n-dimensional configuration space. The notation |q) provides us
with a symbol for the configuration vector and so helps us think of the system 
as a whole rather than the collection of its parts. However, the symbol |q) 
does not represent the configuration vector itself; it represents the matrix of 
components (or coordinates) of the configuration vector with respect to a 
particular basis in configuration space. The same configuration vector may be 
represented by a different matrix of coordinates |Q) with respect to a differ- 
ent basis. Thus, in the matrix formulation we deal with multiple rep- 
resentations of the same physical configuration. The advantage of this is that 
different matrix representations |q), |Q), |S), etc., of the configuration have 
different physical interpretations, and the notation helps keep track of this. It 
has the disadvantage of allowing ambiguities which can lead to confusion; for 
example, two different column matrices might be different sets of coordinates 
for the same configuration or coordinates for two different configurations. 

If the matrix equation (4.10) is to amount to more than a mere abbreviation 
for the set of equations (4.9), we need a system of theorems from matrix 
algebra to facilitate computations. The theorems we need are all straightfor- 
ward generalizations of results proved in Sections 5-1 and 5-2 for the 
3-dimensional case, so we take them for granted here without further proof. 
Actually, for the purpose of illustration, we shall not consider calculations 
with matrices of dimension greater than 3 × 3, because algebraic labor
becomes so great that it is best performed by computers. 


Free Oscillations 


In the absence of external forces, the matrix equation of motion (4.10)
reduces to

$$[m]\,|\ddot{q}) + [k]\,|q) = 0. \tag{4.17}$$

Our first task is to find the general solution of this linear matrix equation. The
most straightforward approach is to consider a trial solution of the form

$$|q) = |a)\, e^{i\omega t}, \tag{4.18}$$

where $\omega$ and the matrix elements of |a) are real numbers. Inserting this into
(4.17) we obtain the matrix equation

$$\left( [k] - \omega^2 [m] \right) |a) = 0, \tag{4.19a}$$

which represents the system of scalar equations

$$\sum_\beta \left( k_{\alpha\beta} - \omega^2 m_{\alpha\beta} \right) a_\beta = 0 \tag{4.19b}$$

for the constants $a_\beta$. The equations have a nontrivial solution if and only if the
characteristic equation

$$\det\left[ k_{\alpha\beta} - \omega^2 m_{\alpha\beta} \right] = 0 \tag{4.20}$$

is satisfied. The determinant is a polynomial of degree n in the variable $\omega^2$. Its
n roots $\omega_\alpha^2$ are real and positive, because the matrices [k] and [m] are real,
symmetric and positive definite, so a real, positive value for each $\omega_\alpha$ is
thereby determined. (This assertion can easily be proved as a byproduct of
the alternative approach we take below.) Using the terminology introduced in
the preceding section, we say that the roots $\omega_\alpha$ are the normal or natural
frequencies of the harmonic system. If two distinct roots have the same
value they are said to be degenerate. Each unique root is said to be
nondegenerate.
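
For n = 2 the characteristic equation (4.20) is a quadratic in $\omega^2$ and can be solved in closed form. The following sketch assumes real symmetric 2 × 2 matrices given as nested lists; the function name `normal_frequencies_2x2` is illustrative. The sample system (unit masses, force constants 2 on the diagonal and -1 off the diagonal) is an assumption chosen to give the simple roots $\omega^2 = 1$ and $\omega^2 = 3$.

```python
import math

def normal_frequencies_2x2(m, k):
    """Roots of det([k] - w^2 [m]) = 0 for symmetric 2x2 matrices [m] and [k]."""
    # Expanding the determinant gives A x^2 - B x + C = 0 with x = w^2.
    A = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    B = (k[0][0] * m[1][1] + k[1][1] * m[0][0]
         - k[0][1] * m[1][0] - k[1][0] * m[0][1])
    C = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    disc = math.sqrt(B * B - 4 * A * C)
    x_low, x_high = (B - disc) / (2 * A), (B + disc) / (2 * A)
    return math.sqrt(x_low), math.sqrt(x_high)

m = [[1.0, 0.0], [0.0, 1.0]]
k = [[2.0, -1.0], [-1.0, 2.0]]
print(normal_frequencies_2x2(m, k))
```

Both roots are real and positive here because [m] and [k] are positive definite, as the general argument requires.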

For each root $\omega_\alpha$ inserted into the matrix equation (4.19a), there is a
solution $|a_\alpha)$ of the equation, which is unique up to a scale factor if the root is
nondegenerate. We have already discussed the matter of degeneracy in
Section 6-3, so it will be sufficient to confine our attention here to the
nondegenerate case. It is convenient to fix the scale of the solution by
normalizing it with the condition

$$(a_\alpha|[m]|a_\alpha) = 1. \tag{4.21}$$

Each $|a_\alpha)$ represents a normal mode with normal frequency $\omega_\alpha$.

The general solution |q) = |q(t)) of the matrix equation (4.17) can now
be expressed as a superposition of normal modes $|a_\alpha)$ with normal
coordinates $Q_\alpha = Q_\alpha(t)$; thus,

$$|q) = \sum_\alpha Q_\alpha\, |a_\alpha). \tag{4.22a}$$

Note that this can be put in the equivalent form

$$|q) = [a]\,|Q), \tag{4.22b}$$

where $[a] = [\, |a_1)\ |a_2)\ \ldots\ |a_n) \,]$ is a matrix with the $|a_\alpha)$ as columns, and
|Q) is the column matrix of normal coordinates. The normal coordinates can
be given the explicit functional form

$$Q_\alpha(t) = C_\alpha \cos(\omega_\alpha t + \delta_\alpha), \tag{4.23}$$

where $C_\alpha$ and $\delta_\alpha$ are constants. Or, if complex coordinates are preferred,

$$Q_\alpha(t) = \left( C_\alpha e^{i\delta_\alpha} \right) e^{i\omega_\alpha t}. \tag{4.24}$$


Note that Equation (4.22b) can be regarded as a relation between two
different sets of coordinates |q) and |Q) for the same system.

The method we have just outlined for solving the equation of motion (4.17)
may be called the brute force method, since it does not take advantage of any
special information that might be known about the matrices [m] and [k].
There are alternative methods of solution which are simpler when certain
kinds of information are available. For example, if the normal modes $|a_\alpha)$
can be determined first, then the normal frequencies can be most easily
obtained from (4.19a), which yields

$$\omega_\alpha^2 = \frac{(a_\alpha|[k]|a_\alpha)}{(a_\alpha|[m]|a_\alpha)}. \tag{4.25}$$

Indeed, we used a variant of this approach in Section 6-3 to determine the
normal frequencies of a one-dimensional lattice. As an example, it may be
noted that the solution (3.38) for that system conforms to the general form for
a solution given by (4.22a) and (4.23).

Whatever method we use to find the normal modes $|a_\alpha)$, we still have the
problem of evaluating the constants in our expression (4.23) for the normal
coordinates. That is most easily solved by using the relation

$$(a_\alpha|[m]|a_\beta) = \delta_{\alpha\beta}, \tag{4.26}$$

which combines the normalization condition (4.21) with the orthogonality
relation

$$(a_\alpha|[m]|a_\beta) = 0 \quad \text{if } \alpha \neq \beta. \tag{4.27}$$

This relation can be proved by multiplying (4.19a) in the form

$$\omega_\alpha^2 [m]\,|a_\alpha) = [k]\,|a_\alpha)$$

by $(a_\beta|$ to get

$$\omega_\alpha^2 (a_\beta|[m]|a_\alpha) = (a_\beta|[k]|a_\alpha).$$

Subtracting from this a similar equation with $\alpha$ and $\beta$ interchanged and using
the symmetry of [m] and [k], we obtain

$$\left( \omega_\alpha^2 - \omega_\beta^2 \right) (a_\beta|[m]|a_\alpha) = 0.$$

Since $\omega_\alpha^2 \neq \omega_\beta^2$ when $\alpha \neq \beta$, this implies (4.27).

Now from (4.22a), (4.23) and (4.26) we obtain

$$C_\alpha \cos \delta_\alpha = (a_\alpha|[m]|q(0)),$$
$$-\omega_\alpha C_\alpha \sin \delta_\alpha = (a_\alpha|[m]|\dot{q}(0)), \tag{4.28}$$

which determines the constants in terms of initial data.

Although the computational complexities of the brute force method have 
been largely overcome by computers, it is still worthwhile to consider alterna- 
tive methods for the insight they provide. Let us first consider how the 
equation of motion (4.17) might be simplified.

Since [m] is nonsingular it has an inverse $[m]^{-1}$, so we might consider
simplifying (4.17) by multiplication by $[m]^{-1}$, to get

$$|\ddot{q}) + [K]\,|q) = 0,$$

where [K] is a new matrix given by the matrix product $[K] = [m]^{-1}[k]$. The
trouble with this procedure is that the product [K] of symmetric matrices $[m]^{-1}$
and [k] is not necessarily symmetric, and we need to exploit the symmetry to
solve the equation. Fortunately, the desired simplification can be achieved in
a slightly different way.

Since [m] is positive definite and symmetric, we know that it has a
well-defined positive square root $[m]^{1/2}$. Indeed, if [m] is diagonal with matrix
elements $m_{\alpha\beta} = m_\alpha \delta_{\alpha\beta}$, then

$$[m]^{1/2} = \begin{bmatrix} m_1^{1/2} & 0 & \cdots & 0 \\ 0 & m_2^{1/2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & m_n^{1/2} \end{bmatrix}.$$

In any case, let us multiply (4.17) with the inverse $[m]^{-1/2}$ of $[m]^{1/2}$ to get it in
the form

$$[m]^{1/2}\,|\ddot{q}) = -[m]^{-1/2}[k][m]^{-1/2}\,[m]^{1/2}\,|q).$$

Thus, the equation of motion can be put in the simple form

$$|\ddot{q}') + [k']\,|q') = 0 \tag{4.29}$$

if we write

$$[k'] = [m]^{-1/2}[k][m]^{-1/2}, \tag{4.30}$$

and introduce a new set of coordinates $q'_\alpha$ for the system defined by

$$|q') = [m]^{1/2}\,|q). \tag{4.31}$$

The $q'_\alpha$ are sometimes called mass-weighted coordinates.

It follows from (4.19) that the matrix [k’] is real, symmetric and positive 
definite, since [m] and [k] have those properties. Therefore, it has n distinct 
eigenvectors |b,,) with corresponding eigenvalues «,, which, of course, will 
prove to be the normal frequencies. Thus, we have 


[k']|b.) = w/b). (4.32) 
Defining a matrix [b] =[ |b,)/b,)...|b,)], these m equations can be 
written as a single matrix equation 

[k’}[5] = [5][a’], (4.33) 


where [w’] is the diagonal matrix 
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w 0 0 
0 w 0 
[w*] = 
0 0 ee. ie 
Since [b] is nonsingular, we can put (4.33) in the form 
[o’] = (6) Tk'TL4], (4.34) 


and we say that $[b]$ diagonalizes $[k']$. We can use this to further simplify the
equation of motion (4.29) by introducing new coordinates $|Q\rangle$ defined by
writing

$$|q'\rangle = [b]|Q\rangle. \tag{4.35}$$

Substituting this into (4.29) and using (4.34), we obtain

$$|\ddot{Q}\rangle = -[\omega^2]|Q\rangle. \tag{4.36}$$

Since $[\omega^2]$ is diagonal, this is equivalent to $n$ uncoupled equations

$$\ddot{Q}_\alpha = -\omega_\alpha^2 Q_\alpha, \tag{4.37}$$

which have the general solution found before.
These $Q_\alpha$ will be identical with the normal coordinates defined before
provided we relate $|b_\alpha\rangle$ to $|a_\alpha\rangle$ by

$$|b_\alpha\rangle = [m]^{1/2}|a_\alpha\rangle, \tag{4.38}$$

which imposes on the $|b_\alpha\rangle$ the normalization

$$\langle b_\alpha|b_\beta\rangle = \langle a_\alpha|[m]|a_\beta\rangle = \delta_{\alpha\beta}. \tag{4.39}$$

To establish the relation to our previous result explicitly, we simply invert
(4.31) and substitute (4.35) to get $|q\rangle = [a]|Q\rangle$ with

$$[a] = [m]^{-1/2}[b]. \tag{4.40}$$

Now we can identify $[a]$ as the matrix which transforms our original equation
of motion (4.17) into the diagonal form (4.36) for which the solution is
elementary.
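
The chain (4.30) to (4.32) can be checked numerically on a small case. The following is a minimal sketch for two degrees of freedom, not a general implementation: the matrices `m` and `k` are illustrative values chosen here, and the helper names (`sqrt_spd_2x2`, `inverse_2x2`, `matmul`, `sym_eigenvalues_2x2`) are hypothetical. The square root uses a closed form that holds only for symmetric positive definite 2x2 matrices.

```python
import math

def sqrt_spd_2x2(M):
    """Positive square root of a symmetric positive definite 2x2 matrix.

    Closed form: sqrt(M) = (M + s*I) / t, with s = sqrt(det M) and
    t = sqrt(trace M + 2*s).
    """
    (a, b), (_, d) = M
    s = math.sqrt(a * d - b * b)
    t = math.sqrt(a + d + 2.0 * s)
    return [[(a + s) / t, b / t], [b / t, (d + s) / t]]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    return [[sum(A[i][n] * B[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def sym_eigenvalues_2x2(S):
    """Eigenvalues of a symmetric 2x2 matrix, smallest first."""
    (a, b), (_, d) = S
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    return mean - half_gap, mean + half_gap

# Illustrative (assumed) mass and stiffness matrices in arbitrary units.
m = [[2.0, 1.0], [1.0, 1.0]]
k = [[2.0, 0.0], [0.0, 1.0]]

m_inv_half = inverse_2x2(sqrt_spd_2x2(m))             # [m]^(-1/2)
k_prime = matmul(matmul(m_inv_half, k), m_inv_half)   # equation (4.30)
omega_sq = sym_eigenvalues_2x2(k_prime)               # eigenvalues in (4.32)
```

For these sample matrices the characteristic equation $\det([k] - \omega^2[m]) = 0$ reduces to $\omega^4 - 4\omega^2 + 2 = 0$, so the eigenvalues of the symmetric matrix $[k']$ should agree with its roots $2 \pm \sqrt{2}$.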


Example 4.1 


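To illustrate the general method, let us apply it to the double pendulum. In
Example 2.2 of Section 6-2 we determined the linearized equations of motion
for the double pendulum, which can be put in the matrix form

$$[m]|\ddot{\phi}\rangle + [k]|\phi\rangle = 0,$$

where

$$[m] = \begin{bmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{bmatrix} = \begin{bmatrix} (m_1 + m_2)l_1^2 & m_2 l_1 l_2 \\ m_2 l_1 l_2 & m_2 l_2^2 \end{bmatrix},$$

$$[k] = \begin{bmatrix} k_{11} & 0 \\ 0 & k_{22} \end{bmatrix} = \begin{bmatrix} (m_1 + m_2) l_1 g & 0 \\ 0 & m_2 l_2 g \end{bmatrix},$$

$$|\phi\rangle = \begin{bmatrix} \phi_1 \\ \phi_2 \end{bmatrix}.$$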

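Since $[k]$ is already diagonal in this case, it is algebraically simpler to
introduce

$$|\phi\rangle = [k]^{-1/2}|\phi'\rangle$$

to put the equation of motion in the form

$$|\ddot{\phi}'\rangle + [k']|\phi'\rangle = 0,$$

where

$$[k'] = [k]^{1/2}[m]^{-1}[k]^{1/2},$$

$$[k]^{1/2} = \begin{bmatrix} k_{11}^{1/2} & 0 \\ 0 & k_{22}^{1/2} \end{bmatrix}$$

and, by a result in Section 5-1,

$$[m]^{-1} = \frac{1}{\det[m]} \begin{bmatrix} m_{22} & -m_{12} \\ -m_{21} & m_{11} \end{bmatrix},$$

where $\det[m] = m_{11}m_{22} - m_{12}^2 = m_1 m_2 l_1^2 l_2^2$. This reduces the problem to
solving the eigenvalue equation (4.32).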
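
As a hedged illustration of where this reduction leads, the sketch below evaluates the matrix elements of Example 4.1, forms $[k'] = [k]^{1/2}[m]^{-1}[k]^{1/2}$ explicitly for diagonal $[k]$, and takes the eigenvalues of the resulting symmetric 2x2 matrix. The function name `double_pendulum_frequencies` and the default value of `g` are assumptions made here for illustration.

```python
import math

def double_pendulum_frequencies(m1, m2, l1, l2, g=9.81):
    """Normal frequencies (rad/s) of the linearized double pendulum, lowest first."""
    # Matrix elements of [m] and [k] from Example 4.1.
    m11 = (m1 + m2) * l1 ** 2
    m12 = m2 * l1 * l2
    m22 = m2 * l2 ** 2
    k11 = (m1 + m2) * l1 * g
    k22 = m2 * l2 * g
    det_m = m11 * m22 - m12 ** 2      # equals m1*m2*l1^2*l2^2

    # [k'] = [k]^(1/2) [m]^(-1) [k]^(1/2), written out for diagonal [k].
    kp11 = k11 * m22 / det_m
    kp22 = k22 * m11 / det_m
    kp12 = -math.sqrt(k11 * k22) * m12 / det_m

    # The eigenvalues of the symmetric matrix [k'] are the squared frequencies.
    mean = 0.5 * (kp11 + kp22)
    half_gap = math.hypot(0.5 * (kp11 - kp22), kp12)
    return math.sqrt(mean - half_gap), math.sqrt(mean + half_gap)

# Equal masses and equal lengths (illustrative values).
w_low, w_high = double_pendulum_frequencies(1.0, 1.0, 1.0, 1.0)
```

For equal masses and equal lengths $l$ the squared frequencies come out as $(2 \mp \sqrt{2})\,g/l$, which gives a quick consistency check on the matrices above.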


Vibrations of Triatomic Molecules 


The frequencies at which molecules absorb electromagnetic radiation depend 
on their normal modes. So the determination of molecular normal modes is of 
great importance for molecular spectroscopy. Here we analyze the normal 
vibrations of bent triatomic molecules such as H$_2$O, SO$_2$ and Cl$_2$O. Our
classical analysis is a necessary prelude to the more precise treatment with
quantum mechanics.

The equilibrium configuration of a bent symmetric triatomic molecule is shown
in Figure 4.1. For H$_2$O, the O-H bond length $r$ and the valence angle $\theta$
have the values

$$r = 0.958 \times 10^{-8}\ \text{cm}, \qquad \theta = 2\phi = 104.5^\circ.$$

Fig. 4.1. Equilibrium configuration of a bent symmetric triatomic molecule.

Let $\mathbf{q}_1$, $\mathbf{q}_2$, $\mathbf{q}_3$ represent displacements of the atoms from their
equilibrium positions as indicated in Figure 4.2. In terms of these variables
the kinetic energy $K$ has the simple form

$$2K = m_1(\dot{\mathbf{q}}_1^2 + \dot{\mathbf{q}}_2^2) + m_3\dot{\mathbf{q}}_3^2. \tag{4.41}$$


However, we are interested here only in vibrations and not rotations and 
translations of the molecule as a whole. So the 3 x 3 = 9 degrees of freedom 
of the three displacement vectors must be restricted. First, an internal vibration
cannot shift the center of mass, so we must require

$$m_1(\mathbf{q}_1 + \mathbf{q}_2) + m_3\mathbf{q}_3 = 0. \tag{4.42}$$

Second, the vibrations must lie in the plane of the molecule. And third, the
molecule must not rotate in the plane. Thus, after subtracting the three
rotational degrees of freedom, we have $9 - 2 \times 3 = 3$ independent
vibrational degrees of freedom remaining. This leaves us with the problem of
choosing an appropriate set of internal coordinates. We can't use
normal coordinates until we have determined the normal modes.

For direct physical description of the molecule, variations in the bond 
lengths, say $R_1$, $R_2$, $R_3$, and variations in the valence angle, say $R_\theta$, are natural
internal coordinates. Indeed, they provide the simplest expressions for the 
potential energy. The internal potential energy function is very difficult to 
calculate from first principles, and has been found for only a few simple 
molecules. So we must be content with estimating it from auxiliary assump- 
tions. The first assumption which recommends itself is that when the system is 
in a near-equilibrium state, the forces of attraction and repulsion between 
atoms are central. In that case, the potential energy will depend only on the 
bond lengths, with the specific functional form

$$2V = k_1(R_1^2 + R_2^2) + k_3 R_3^2. \tag{4.43}$$

Fig. 4.2. Atomic displacements from equilibrium.


The constants $k_1$ and $k_3$ are unknown but they can be evaluated from
spectroscopic data after the normal frequencies have been calculated. Since 
there are only two unknown constants, one of the three normal frequencies 
can be computed from data for the other two, thus providing a specific 
prediction of the model. It turns out that the predictions for various molecules 
are accurate to about 25%. Therefore, the central force assumption is only a 
moderately accurate description of internal molecular forces. 
Much better results have been obtained with a potential of the form

$$2V = k_1(R_1^2 + R_2^2) + k_\theta R_\theta^2. \tag{4.44}$$

The constant $k_1$ describes resistance of the main bonds to stretching while $k_\theta$
describes resistance to bending. The potential function (4.44) is also more
reasonable than (4.43) because bending and stretching variations are orthogonal
to one another, as we expect of uncoupled variables in the potential
energy function. It has the additional advantage of applying to "straight" as
well as "bent" molecules.

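To use the potential energy function (4.44) along with the kinetic energy
function (4.41), we need to relate the stretching and bending variables $R_1$, $R_2$,
$R_\theta$ to the displacement vectors $\mathbf{q}_1$, $\mathbf{q}_2$, $\mathbf{q}_3$. To that end, it is convenient to
represent the relative equilibrium positions of the atoms by vectors $\mathbf{r}_1$, $\mathbf{r}_2$, $\mathbf{r}_3$ as
in Figure 4.2. A variation in the bond length $R_1 = \Delta r_1$ is related to the
relative atomic displacements $\Delta\mathbf{r}_1 = \mathbf{q}_1 - \mathbf{q}_3$ by differentiating the constraint
$r_1^2 = \mathbf{r}_1^2$. Thus, $r_1 \Delta r_1 = \mathbf{r}_1 \cdot \Delta\mathbf{r}_1$, so

$$R_1 = \Delta r_1 = \hat{\mathbf{r}}_1 \cdot \Delta\mathbf{r}_1 = \hat{\mathbf{r}}_1 \cdot (\mathbf{q}_1 - \mathbf{q}_3), \tag{4.45a}$$

where $\hat{\mathbf{r}}_1 = \mathbf{r}_1 / r_1$. Similarly,

$$R_2 = \Delta r_2 = \hat{\mathbf{r}}_2 \cdot \Delta\mathbf{r}_2 = \hat{\mathbf{r}}_2 \cdot (\mathbf{q}_2 - \mathbf{q}_3). \tag{4.45b}$$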


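To relate a variation in the bond angle $R_\theta = \Delta\theta$ to the atomic displacements
we differentiate

$$\mathbf{r}_1\mathbf{r}_2 = r_1 r_2 e^{i\theta},$$

where $i = \boldsymbol{\sigma}_1\boldsymbol{\sigma}_2$ is the unit bivector for the plane of the molecule. Thus,

$$\Delta\mathbf{r}_1\,\mathbf{r}_2 + \mathbf{r}_1\,\Delta\mathbf{r}_2 = (r_2\,\Delta r_1 + r_1\,\Delta r_2 + r_1 r_2\, i\,\Delta\theta)\,e^{i\theta}.$$

Taking the scalar part of this expression and using (4.45a, b) we find

$$\Delta\theta = \frac{(\hat{\mathbf{r}}_1\cos\theta - \hat{\mathbf{r}}_2)\cdot\Delta\mathbf{r}_1}{r_1\sin\theta} + \frac{(\hat{\mathbf{r}}_2\cos\theta - \hat{\mathbf{r}}_1)\cdot\Delta\mathbf{r}_2}{r_2\sin\theta}. \tag{4.46}$$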
This expression holds for the bond angle between any three atoms. For the
symmetrical triatomic molecule we have $r_1 = r_2 = r$, and it will be convenient
to employ the half angle $\phi = \theta/2$, so we write (4.46) in the form

$$R_\theta\, r \sin 2\phi = (\hat{\mathbf{r}}_1\cos 2\phi - \hat{\mathbf{r}}_2)\cdot(\mathbf{q}_1 - \mathbf{q}_3) + (\hat{\mathbf{r}}_2\cos 2\phi - \hat{\mathbf{r}}_1)\cdot(\mathbf{q}_2 - \mathbf{q}_3). \tag{4.47}$$

We could invert (4.45a), (4.45b), and (4.47) to express the $\mathbf{q}_k$ as functions of
$R_1$, $R_2$ and $R_\theta$, but there is a better approach.

Experience has shown that internal vibrations are best described in terms of 
variables reflecting symmetries of the molecular structure. The water mol- 
ecule is symmetrical about an axis through the equilibrium position of the 
oxygen atom, as shown in Figure 4.1. Accordingly, we introduce symmetry 
coordinates $S_1$, $S_2$, $S_3$ for three independent sets of atomic displacements, as
indicated in Figure 4.3. In terms of these variables, the atomic displacements
are given by

$$\begin{aligned}
\mathbf{q}_1 &= (S_1 - S_3\sin\phi)\,\boldsymbol{\sigma}_1 + (S_2 - S_3\cos\phi)\,\boldsymbol{\sigma}_2, \\
\mathbf{q}_2 &= (-S_1 - S_3\sin\phi)\,\boldsymbol{\sigma}_1 + (S_2 + S_3\cos\phi)\,\boldsymbol{\sigma}_2, \\
\mathbf{q}_3 &= \left(\frac{2m_1}{m_3}S_3\sin\phi\right)\boldsymbol{\sigma}_1 + \left(-\frac{2m_1}{m_3}S_2\right)\boldsymbol{\sigma}_2.
\end{aligned} \tag{4.48}$$

Fig. 4.3. Atomic displacements corresponding to symmetry coordinates (not normalized).

These expressions were constructed by assigning unit displacements to particles
1 and 2 (for each variable) and then obtaining $\mathbf{q}_3$ from (4.42).


We express the kinetic energy in terms of symmetry coordinates by inserting
(4.48) into (4.41):

$$2K = m_{11}\dot{S}_1^2 + m_{22}\dot{S}_2^2 + m_{33}\dot{S}_3^2, \tag{4.49a}$$

where

$$m_{11} = 2m_1, \qquad m_{22} = 2am_1, \qquad m_{33} = 2bm_1, \tag{4.49b}$$

with

$$a = 1 + \frac{2m_1}{m_3}, \qquad b = 1 + \frac{2m_1}{m_3}\sin^2\phi. \tag{4.50}$$

The stretching and bending variables are expressed in terms of the symmetry
coordinates by inserting (4.48) into (4.45a,b) and (4.47), with the results

$$\begin{aligned}
R_1 &= -S_1\sin\phi - aS_2\cos\phi + bS_3, && (4.51\text{a}) \\
R_2 &= -S_1\sin\phi - aS_2\cos\phi - bS_3, && (4.51\text{b}) \\
rR_\theta &= -2S_1\cos\phi + 2aS_2\sin\phi. && (4.51\text{c})
\end{aligned}$$
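
The substitution leading to (4.51) is easy to get wrong by a sign, so a numerical spot check may be useful. The sketch below evaluates (4.45a,b) and (4.47) directly from the displacements (4.48). It assumes a particular orientation for Figure 4.2, with atom 3 at the apex and the symmetry axis along $\boldsymbol{\sigma}_2$, so that $\hat{\mathbf{r}}_1 = (-\sin\phi, -\cos\phi)$ and $\hat{\mathbf{r}}_2 = (\sin\phi, -\cos\phi)$. The function name `internal_coordinates` and the parameter `mass_ratio` (standing for $2m_1/m_3$) are hypothetical.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def internal_coordinates(S1, S2, S3, phi, mass_ratio):
    """Return (R1, R2, r*R_theta) computed from the displacements (4.48)."""
    s, c = math.sin(phi), math.cos(phi)
    # Atomic displacements (4.48) as (sigma_1, sigma_2) components.
    q1 = (S1 - S3 * s, S2 - S3 * c)
    q2 = (-S1 - S3 * s, S2 + S3 * c)
    q3 = (mass_ratio * S3 * s, -mass_ratio * S2)
    # Assumed unit vectors along the two bonds (orientation of Figure 4.2).
    r1_hat = (-s, -c)
    r2_hat = (s, -c)
    d1 = (q1[0] - q3[0], q1[1] - q3[1])
    d2 = (q2[0] - q3[0], q2[1] - q3[1])
    R1 = dot(r1_hat, d1)                                   # (4.45a)
    R2 = dot(r2_hat, d2)                                   # (4.45b)
    cos2, sin2 = math.cos(2 * phi), math.sin(2 * phi)
    u1 = (r1_hat[0] * cos2 - r2_hat[0], r1_hat[1] * cos2 - r2_hat[1])
    u2 = (r2_hat[0] * cos2 - r1_hat[0], r2_hat[1] * cos2 - r1_hat[1])
    r_R_theta = (dot(u1, d1) + dot(u2, d2)) / sin2         # (4.47)
    return R1, R2, r_R_theta

# Arbitrary test displacement for a water-like geometry.
phi = math.radians(52.25)
ratio = 0.125
S1, S2, S3 = 0.3, -0.2, 0.5
R1, R2, rRt = internal_coordinates(S1, S2, S3, phi, ratio)
```

With $a$ and $b$ from (4.50), the three returned values should reproduce (4.51a), (4.51b) and (4.51c) for any choice of $S_1$, $S_2$, $S_3$.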


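Inserting this into (4.44), we get the following expression for the potential
energy in terms of symmetry coordinates:

$$2V = k_{11}S_1^2 + 2k_{12}S_1S_2 + k_{22}S_2^2 + k_{33}S_3^2, \tag{4.52}$$

where

$$\begin{aligned}
k_{11} &= 2k_1\sin^2\phi + \frac{4k_\theta}{r^2}\cos^2\phi, \\
k_{12} &= 2a\cos\phi\,\sin\phi\left(k_1 - \frac{2k_\theta}{r^2}\right), \\
k_{22} &= 2a^2\left(k_1\cos^2\phi + \frac{2k_\theta}{r^2}\sin^2\phi\right), \\
k_{33} &= 2b^2k_1.
\end{aligned} \tag{4.53}$$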


The introduction of symmetry coordinates has simplified the kinetic and 
potential energy functions so the normal frequencies can easily be calculated 
by the brute force method. In this case, the characteristic determinant (4.20) 
has the form 


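$$\begin{vmatrix} k_{11} - \omega^2 m_{11} & k_{12} & 0 \\ k_{12} & k_{22} - \omega^2 m_{22} & 0 \\ 0 & 0 & k_{33} - \omega^2 m_{33} \end{vmatrix} = 0. \tag{4.54}$$

This factors immediately into the two equations

$$k_{33} - \omega^2 m_{33} = 0, \tag{4.55}$$

$$m_{11}m_{22}\,\omega^4 - (m_{11}k_{22} + m_{22}k_{11})\,\omega^2 + k_{11}k_{22} - k_{12}^2 = 0. \tag{4.56}$$

The latter equation can be put in the factored form $(\omega^2 - \omega_1^2)(\omega^2 - \omega_2^2) = 0$ if
$\omega_1^2$ and $\omega_2^2$ are its roots; so, by comparison of coefficients, we can express the
roots of (4.56) by the equations

$$\omega_1^2 + \omega_2^2 = \frac{m_{11}k_{22} + m_{22}k_{11}}{m_{11}m_{22}}, \qquad \omega_1^2\omega_2^2 = \frac{k_{11}k_{22} - k_{12}^2}{m_{11}m_{22}}. \tag{4.57}$$

Using (4.49b), (4.50) and (4.53) to evaluate the right sides of these equations
in terms of molecular parameters, we obtain

$$\omega_1^2 + \omega_2^2 = \left(1 + \frac{2m_1}{m_3}\cos^2\phi\right)\frac{k_1}{m_1} + \frac{2}{m_1}\left(1 + \frac{2m_1}{m_3}\sin^2\phi\right)\frac{k_\theta}{r^2}, \tag{4.58}$$

$$\omega_1^2\omega_2^2 = 2\left(1 + \frac{2m_1}{m_3}\right)\frac{k_1 k_\theta}{m_1^2 r^2}. \tag{4.59}$$

From (4.55) we get an independent expression for the other normal frequency:

$$\omega_3^2 = \left(1 + \frac{2m_1}{m_3}\sin^2\phi\right)\frac{k_1}{m_1}. \tag{4.60}$$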

Experimentally determined values for the normal frequencies of the water
molecule are

$$\frac{\omega_1}{2\pi c} = 3654\ \text{cm}^{-1}, \qquad \frac{\omega_2}{2\pi c} = 1595\ \text{cm}^{-1}, \qquad \frac{\omega_3}{2\pi c} = 3756\ \text{cm}^{-1},$$

where $c$ is the speed of light. Substitution of these values into (4.59) and
(4.60) yields the following values for the force constants:

$$k_1 = 7.76 \times 10^{5}\ \text{dyne cm}^{-1}, \qquad \frac{k_\theta}{r^2} = 0.69 \times 10^{5}\ \text{dyne cm}^{-1}.$$

Comparison with the remaining equation (4.58) shows a two percent discrepancy,
which is attributed to anharmonic forces. Note that the effective bending
constant $k_\theta/r^2$ is only about ten percent as large as the stretching constant $k_1$,
indicating that bonds are easier to bend than stretch.
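
The arithmetic behind these force constants can be retraced in a few lines. The sketch below is an illustration under stated assumptions, not the calculation used in the text: the hydrogen atom mass `M_H` and the speed of light are standard values supplied here, and the mass ratio $2m_1/m_3 = 1/8$ is the rounded value quoted later in this section. It solves (4.60) for $k_1$, then (4.59) for $k_\theta/r^2$, and compares the sum predicted by (4.58) with the measured $\omega_1^2 + \omega_2^2$.

```python
import math

C_LIGHT = 2.99792458e10         # speed of light in cm/s
M_H = 1.6737e-24                # hydrogen atom mass in grams (assumed value)
MASS_RATIO = 1.0 / 8.0          # 2*m1/m3 for water, rounded as in the text
PHI = math.radians(104.5 / 2)   # half of the valence angle

def angular_frequency_sq(wavenumber):
    """Square of the angular frequency for a wavenumber given in cm^-1."""
    return (2.0 * math.pi * C_LIGHT * wavenumber) ** 2

w1_sq, w2_sq, w3_sq = (angular_frequency_sq(nu) for nu in (3654.0, 1595.0, 3756.0))

sin2, cos2 = math.sin(PHI) ** 2, math.cos(PHI) ** 2
a = 1.0 + MASS_RATIO
b = 1.0 + MASS_RATIO * sin2

k1 = w3_sq * M_H / b                                          # from (4.60)
k_theta_over_r2 = w1_sq * w2_sq * M_H ** 2 / (2.0 * a * k1)   # from (4.59)

# Right side of (4.58), compared with the measured sum of squares.
predicted_sum = ((1.0 + MASS_RATIO * cos2) * k1 / M_H
                 + 2.0 * b * k_theta_over_r2 / M_H)
discrepancy = predicted_sum / (w1_sq + w2_sq) - 1.0
```

With these inputs `k1` and `k_theta_over_r2` land within about one percent of the quoted values in dyne/cm, and `discrepancy` is close to two percent, in line with the statement above.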

To complete the determination of normal modes, we need to find the
matrix $[a]$ relating symmetry coordinates $|S\rangle$ to normal coordinates $|Q\rangle$
by $|S\rangle = [a]|Q\rangle$. From (4.54) it is evident that $[a]$ has the form

$$[a] = \begin{bmatrix} a_{11} & a_{12} & 0 \\ a_{21} & a_{22} & 0 \\ 0 & 0 & a_{33} \end{bmatrix}. \tag{4.61}$$

This shows that $S_3$ is already a normal coordinate differing from $Q_3$ only by a
normalization factor, which we can read off directly from the expression
(4.49) for kinetic energy:

$$a_{33} = m_{33}^{-1/2} = (2bm_1)^{-1/2}. \tag{4.62}$$

The brute force method gives us the ratios

$$\frac{a_{1\alpha}}{a_{2\alpha}} = \frac{\omega_\alpha^2 m_{22} - k_{22}}{k_{12}} \tag{4.63}$$

for $\alpha = 1, 2$. For the water molecule the ratios have the values
Qi, 


= 12 08. 


ay, 22 


This information is sufficient for us to construct a diagrammatic represen- 
tation of the displacements in the normal modes from Figure 4.3. The result is 
shown in Figure 4.4. The displacement vector for the oxygen atom in the 
figure should actually be reduced by a factor $2m_1/m_3 = 1/8$ to satisfy the
center of mass constraint (4.42), and, of course, the displacements are 
exaggerated compared to the scale of interatomic distances. 


Fig. 4.4. Normal modes of the water molecule. 


Damped Oscillations 


For a harmonic system with linear damping, the equation of motion has the 
form 


$$[m]|\ddot{q}\rangle + [\mu]|\dot{q}\rangle + [k]|q\rangle = 0. \tag{4.64}$$

This is the matrix version of the system of coupled linear differential equations

$$\sum_\beta (m_{\alpha\beta}\ddot{q}_\beta + \mu_{\alpha\beta}\dot{q}_\beta + k_{\alpha\beta}q_\beta) = 0. \tag{4.65}$$

The matrix $[\mu]$ is symmetric and can be derived from the assumption of linear
damping for each of the particles, as shown in Example 2.2 of Section 6-2. In
general, the matrix elements $\mu_{\alpha\beta} = \mu_{\alpha\beta}(q)$ depend on the configuration, but in
the linear approximation we use the constant values $\mu_{\alpha\beta} = \mu_{\alpha\beta}(0)$, just as we
have done with the mass matrix elements $m_{\alpha\beta}$.

Equation (4.64) can be solved by the brute force method. One substitutes a
trial solution of the form


$$|q\rangle = |a\rangle e^{i\Omega t}$$

into the equation to get

$$(-\Omega^2[m] + i\Omega[\mu] + [k])|a\rangle = 0. \tag{4.66}$$

This has a nontrivial solution only if

$$\det(-\Omega^2 m_{\alpha\beta} + i\Omega\mu_{\alpha\beta} + k_{\alpha\beta}) = 0. \tag{4.67}$$

This determinant is a polynomial of degree $2n$ in $\Omega$, with $2n$ (complex) roots
$\Omega_\alpha$. If the roots are all distinct, then (4.66) yields $2n$ corresponding
solutions $|a_\alpha\rangle$, and the general solution of (4.64) is a superposition of the $2n$
orthogonal solutions

$$|a_\alpha\rangle e^{i\Omega_\alpha t}. \tag{4.68}$$

If the characteristic determinant (4.67) is $m$-fold degenerate in a root $\Omega_\alpha$, then
a trial solution of the form

$$|q\rangle = (|a_1\rangle + |a_2\rangle t + \cdots + |a_m\rangle t^{m-1})\,e^{i\Omega_\alpha t}$$


will work. This generalizes the case of the critically damped harmonic oscil- 
lator discussed in Section 3-8. 

The brute force method always works, but there are simpler methods 
exploiting special symmetries of the coefficient matrices. In some cases there
is a change of variables which simultaneously diagonalizes all three matrices 
[m], [u] and [k]. Thus, if 


$$[\mu] = \gamma[m], \tag{4.69}$$

then the change of variables $|q\rangle = [a]|Q\rangle$, which we found in the undamped
case, puts the equation of motion in the form

$$|\ddot{Q}\rangle + \gamma|\dot{Q}\rangle + [\omega^2]|Q\rangle = 0, \tag{4.70}$$

where $[\omega^2]$ is the diagonal matrix (4.34). So each coordinate $Q_\alpha$ separately
satisfies the equation for a damped harmonic oscillator:

$$\ddot{Q}_\alpha + \gamma\dot{Q}_\alpha + \omega_\alpha^2 Q_\alpha = 0. \tag{4.71}$$

This has the complex solution

$$Q_\alpha = C_\alpha e^{i\Omega_\alpha t}, \tag{4.72}$$

where

$$\Omega_\alpha = (\omega_\alpha^2 - \tfrac{1}{4}\gamma^2)^{1/2} + \tfrac{1}{2}i\gamma. \tag{4.73}$$

For light damping ($\omega_\alpha^2 \gg \gamma^2$), the "physical part" of the solution can therefore
be written

$$Q_\alpha = C_\alpha e^{-\gamma t/2}\cos(\omega_\alpha t + \delta_\alpha). \tag{4.74}$$


This shows that all the normal modes are equally damped despite the 
differences in frequency. 
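
A short numerical check makes the last statement concrete. The sketch below evaluates the complex frequency (4.73) for several modes sharing one damping constant and confirms that each value satisfies the characteristic equation of (4.71). The sample values of `gamma` and `omegas` and the function names are assumptions chosen for illustration.

```python
import cmath

def damped_mode_frequency(omega, gamma):
    """Complex frequency (4.73) for undamped frequency omega and damping gamma."""
    return cmath.sqrt(omega ** 2 - 0.25 * gamma ** 2) + 0.5j * gamma

def residual(Omega, omega, gamma):
    """Left side of (4.71) divided by Q for the trial solution exp(i*Omega*t)."""
    return -Omega ** 2 + 1j * gamma * Omega + omega ** 2

gamma = 0.3                    # assumed common damping constant
omegas = [1.0, 2.5, 4.0]       # assumed undamped normal frequencies
Omegas = [damped_mode_frequency(w, gamma) for w in omegas]
decay_rates = [Om.imag for Om in Omegas]
```

Every entry of `decay_rates` equals `gamma / 2`, so the amplitude factor $e^{-\gamma t/2}$ in (4.74) is the same for all modes, while the real parts of `Omegas` differ from mode to mode.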


Example 4.2 


For the damped double pendulum discussed in Example 2.2, the results of 
Example 2.2 enable us to write the linearized equation of motion 


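$$[m]|\ddot{\phi}\rangle + [\mu]|\dot{\phi}\rangle + [k]|\phi\rangle = 0,$$

where

$$[\mu] = \begin{bmatrix} (\mu_1 + \mu_2)l_1^2 & \mu_2 l_1 l_2 \\ \mu_2 l_1 l_2 & \mu_2 l_2^2 \end{bmatrix}$$

and all the other matrices are as given in Example 4.1. Comparison of $[\mu]$ with
$[m]$ in Example 4.1 shows that $[\mu] = \gamma[m]$ if $m_1 = m_2 = m$ and $\mu_1 = \mu_2 = \gamma m$.
Then the method just discussed can be applied (Exercise 4.10).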


Forced Oscillations 


When the $i$th particle in a harmonic system is subject to an external sinusoidal
force $\mathbf{f}_i \sin\omega t$, according to (2.14), the system is subject to a generalized force
with components

$$F_\alpha \sin\omega t = \sum_i \mathbf{f}_i\cdot(\partial_{q_\alpha}\mathbf{x}_i)\,\sin\omega t.$$

Accordingly, Lagrange's equation yields the equations of motion

$$\sum_\beta (m_{\alpha\beta}\ddot{q}_\beta + \mu_{\alpha\beta}\dot{q}_\beta + k_{\alpha\beta}q_\beta) = F_\alpha e^{i\omega t}, \tag{4.75}$$

where the generalized force is taken to be complex for convenience, and the
$F_\alpha$ are constant in the first order approximation. In matrix form,


$$[m]|\ddot{q}\rangle + [\mu]|\dot{q}\rangle + [k]|q\rangle = |F\rangle e^{i\omega t}. \tag{4.76}$$

This generalizes Equation (3-9.4) for a forced harmonic oscillator.

As in the harmonic oscillator case, the general solution of (4.76) consists of
a particular solution plus a solution of the homogeneous equation (4.70)
determined by the initial conditions. Since the homogeneous solution has
been discussed, and it can be ignored in the presence of a steady driving force
because it decays exponentially to zero, we can concentrate on the particular
solution.

If the matrices $[m]$, $[\mu]$, $[k]$ can be diagonalized simultaneously by the
change of variables $|q\rangle = [a]|Q\rangle$, we may choose $[a]$ in the form (4.40), to put
the equation of motion (4.76) in the form


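$$|\ddot{Q}\rangle + [\gamma]|\dot{Q}\rangle + [\omega^2]|Q\rangle = |f\rangle e^{i\omega t}, \tag{4.77}$$

where

$$|f\rangle = [a]^{\mathsf{T}}|F\rangle. \tag{4.78}$$

The components of (4.77) obey the independent equations of motion

$$\ddot{Q}_\alpha + \gamma_\alpha\dot{Q}_\alpha + \omega_\alpha^2 Q_\alpha = f_\alpha e^{i\omega t}. \tag{4.79}$$

This is the equation for the harmonic oscillator solved in Section 3-9. In the
present case, the particular solution can be written

$$Q_\alpha = \frac{f_\alpha\, e^{i(\omega t - \delta_\alpha)}}{[(\omega_\alpha^2 - \omega^2)^2 + \gamma_\alpha^2\omega^2]^{1/2}}, \tag{4.80a}$$

where

$$\tan\delta_\alpha = \frac{\gamma_\alpha\omega}{\omega_\alpha^2 - \omega^2}. \tag{4.80b}$$

Thus, the normal modes are excited independently of one another, and the
excitation of each normal mode depends on the amplitude $f_\alpha$ of the generalized
force as well as the driving frequency $\omega$. Excitation is a maximum when
the driving frequency is equal to one of the resonant frequencies

$$\omega_{\alpha R} = (\omega_\alpha^2 - \tfrac{1}{2}\gamma_\alpha^2)^{1/2}. \tag{4.81}$$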


Molecules and crystals driven by electromagnetic waves absorb energy at such 
resonant frequencies. 
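
The location of the response maximum can be confirmed by a direct scan. The following sketch evaluates the amplitude and phase lag of (4.80a,b) for a single mode and searches a frequency grid for the peak. The mode parameters, the grid spacing and the function names are assumptions made for illustration.

```python
import math

def amplitude(omega, omega_a, gamma_a, f_a=1.0):
    """Magnitude of the steady-state response (4.80a)."""
    return f_a / math.sqrt((omega_a ** 2 - omega ** 2) ** 2 + (gamma_a * omega) ** 2)

def phase_lag(omega, omega_a, gamma_a):
    """Phase lag delta from (4.80b), taken in the range 0 to pi."""
    return math.atan2(gamma_a * omega, omega_a ** 2 - omega ** 2)

def resonant_frequency(omega_a, gamma_a):
    """Driving frequency of maximum response, equation (4.81)."""
    return math.sqrt(omega_a ** 2 - 0.5 * gamma_a ** 2)

omega_a, gamma_a = 2.0, 0.4    # assumed mode parameters
grid = [0.001 * n for n in range(1, 4001)]
omega_peak = max(grid, key=lambda w: amplitude(w, omega_a, gamma_a))
```

The grid maximum `omega_peak` agrees with `resonant_frequency(omega_a, gamma_a)` to within the grid spacing, and the phase lag passes through $\pi/2$ when the driving frequency equals $\omega_\alpha$.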

The general case when the equation of motion (4.76) cannot be decom- 
posed into independently excited normal modes will not be discussed here. 
But it should be mentioned that the general case is also characterized by 
multiple resonant frequencies. 


4-1. Exercises 


(4.1) Complete the solution of Example 4.1 to determine the normal 
modes |a,) and normal frequencies w,, when m, = 2m, = 2m, |, = 
31, 1, = 2/1. Determine the normal coordinates Q, and Q, as 
functions of the angles $\phi_1$ and $\phi_2$. Specify initial conditions which
will excite each of the normal modes. 

(4.2) Show that in terms of normal coordinates the energy of a harmonic 
system is given by 


$$2E = \sum_\alpha (\dot{Q}_\alpha^2 + \omega_\alpha^2 Q_\alpha^2) = \langle\dot{Q}|\dot{Q}\rangle + \langle Q|[\omega^2]|Q\rangle.$$

(4.3) Three identical plane pendulums are suspended from a slightly
yielding support, so their motions are coupled (Figure 4.5). To
simplify the mathematical description, adopt a system of units in
which the masses, lengths, and weights of the pendulums are equal
to unity. The linearized potential energy for the system has the form

$$2V = \phi_1^2 + \phi_2^2 + \phi_3^2 - 2k\phi_1\phi_2 - 2k\phi_1\phi_3 - 2k\phi_2\phi_3.$$

Fig. 4.5. Coupled pendulums.

Determine the normal frequencies and modes of oscillation. The
system has degenerate modes, but show that an orthogonal set of
normal coordinates $|Q\rangle$ can be chosen so that $|\phi\rangle = [a]|Q\rangle$, where

$$[a] = \frac{1}{\sqrt{6}}\begin{bmatrix} \sqrt{3} & 1 & \sqrt{2} \\ -\sqrt{3} & 1 & \sqrt{2} \\ 0 & -2 & \sqrt{2} \end{bmatrix}.$$

Illustrate the normal modes with diagrams. Find an alternative set
of normal modes, and explain how they differ physically from the
first set.

(4.4) Show that $R_3 = -2S_1$ and derive general expressions for the normal
frequencies of a bent triatomic molecule assuming a potential
energy of the form (4.43). Evaluate the consistency of the results
with empirical data for the water molecule, and compare with the
results in the text.

(4.5) Apply the results of the text to linear triatomic molecules by taking
$\phi = 90^\circ$. Sketch the normal modes. One of the modes is doubly
degenerate, allowing circular displacements of the atoms without a
net angular momentum of the molecule as a whole. Sketch the circular
modes.

(4.6) The experimentally determined normal frequencies of the CO$_2$
molecule are

$$\frac{\omega_1}{2\pi c} = 1337\ \text{cm}^{-1}, \qquad \frac{\omega_2}{2\pi c} = 667\ \text{cm}^{-1}, \qquad \frac{\omega_3}{2\pi c} = 2349\ \text{cm}^{-1}.$$

Use the results of the preceding exercise to evaluate the force
constants $k_1$ and $k_\theta/r^2$. Then check for self-consistency.

(4.7) Compute the normal frequencies for the two one-dimensional normal
modes of symmetric linear triatomic molecules such as CO$_2$ and
H$_2$S by assuming a potential energy of the form

$$2V = K[(q_1 - q_3)^2 + (q_2 - q_3)^2].$$

Compare with the results of the preceding exercise for CO$_2$.

(4.8) Suppose the two particles in Figure 3.1 have different masses. 
Determine the normal modes and frequencies of the system as 
explicit functions of the physical parameters. 

(4.9) For the two identical particles in Figure 3.1, determine the general 
solution to their equations of motion if they are subject to linear 
damping forces. 

(4.10) Complete Example 4.2 by finding the general solution when $l_1 = l_2$.

(4.11) Determine the natural frequencies of the coupled pendulums in 
Exercise (2.5) in Section 6-2. 


6-5. The Newtonian Many Body Problem 


The many body problem in classical mechanics is this: For a system of N 
particles with known interactions, determine the evolution of the system from 
any given initial state. In the preceding section we studied one kind of many 
body problem appropriate for modeling elastic solids. In the Newtonian many
body problem, the interactions are described by Newton's "Universal Law of
Gravitation". Then the equations of motion for particles in the system have
the specific form

$$m_i\ddot{\mathbf{x}}_i = -G\sum_{j \neq i}\frac{m_i m_j(\mathbf{x}_i - \mathbf{x}_j)}{|\mathbf{x}_i - \mathbf{x}_j|^3} \tag{5.1}$$

for $i, j = 1, 2, \ldots, N$. The problem is to characterize all solutions of this
system of coupled nonlinear differential equations. This is a mathematical 
problem of such importance and difficulty that it has engaged the best efforts 
of some of the greatest mathematicians. To this day, a general solution has 
not been found even for the case of three bodies, and much of the progress 
has been in understanding what actually constitutes a solution when the 
solution cannot be expressed in terms of known functions. 
Up to the twentieth century, Newton’s Law of Gravitation was the best 
candidate for an exact force law, so the most important reason for studying 
the many body problem was to work out detailed implications of the law
which could be subjected to empirical test. Since the advent of Einstein's
General Theory of Relativity it has been clear that Newton’s law is not an 
exact description of the gravitational interaction in nature. The exact impli- 
cations of Newton’s law still provide baseline predictions from which to 
measure the small deviations predicted by Einstein’s theory and possible 
alternatives. However, the most sensitive tests for distinguishing alternative 
theories are in situations free of many body complications. 

Although the many body problem may no longer be so important as a test 
of fundamental gravitational theory, it is crucial to many applications in 
astronomy and spacecraft mechanics. Thus, an understanding of the many 
body problem is needed to trace the evolution of the solar system and answer 
such questions as, will any of the planets collide in the future? Have any 
collided in the past? Of course, the evolution of the solar system is greatly 
influenced by other factors such as the energy flux from the Sun and the 
dissipation of energy by tidal friction. But the many body dynamics is no less 
significant for that. 

In Chapter 4 the gravitational two body problem was solved completely by 
finding four first integrals (or constants) of motion: the center of mass 
velocity and initial position, the angular momentum and the eccentricity 
vector. It is natural, therefore, to attack the general N body problem by 
looking for new first integrals. Unfortunately, that has turned out to be futile. 

The system of equations (5.1) is of order 6N, since it consists of N vector 
equations of order two. According to the theory of differential equations, 
then, its general solution depends on 6N scalar parameters or, equivalently 
2N vector parameters, which amounts to specifying the initial position and 
velocity for each of the N particles. From the general many particle theory of 
Section 6-1, we know that, for an isolated system, the angular momentum and 
the center of mass momentum are constants of motion; from which it follows 
that the center of mass position is determined by its initial value. This gives us 
nine scalar (three vector) integrals of motion. In addition we know that the 
total energy is an integral of the motion, because the gravitational force is 
conservative. Generalizing work by Bruns and Poincaré, Painlevé (1897) 
proved that, besides these 10 integrals, the general Newtonian N body 
problem admits no other constants of motion which are algebraic functions of 
the positions and velocities of the particles or even integrals of such functions. 
There might be constants of motion expressible in terms of other variables, 
but none have been found. This leaves 6N— 10 variables to be determined by 
other means. For that no general method is known. That is where the many 
body problem gets difficult. 

For given initial conditions the N body problem can be solved by direct
numerical integration of the equations of motion (5.1). More practical in most
situations is the perturbation method, which generates N body solutions by 
calculating deviations from two body solutions. It will be developed in 
Chapter 8. Though these methods generate specific solutions, they give us
little insight into qualitative features of solutions in general. They cannot tell
us, for example, what specific initial conditions produce periodic or nearly 
periodic orbits, or when small changes in the initial conditions produce wildly 
different orbits, or what initial conditions allow one of the bodies to escape to 
infinity. To answer such qualitative questions about orbits, the great French 
mathematician Henri Poincaré developed new mathematical methods in the 
last quarter of the nineteenth century which proved to be seeds for whole new 
branches of mathematics blossoming in the twentieth century; including 
algebraic topology, global analysis and dynamical systems theory. Although 
here we cannot go deeply into the qualitative theory of differential equations 
incorporating such methods, we will survey what has been learned about the 3 
body problem and add some observations about the generalization to N 
bodies. 
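
To make the idea of direct numerical integration concrete, here is a minimal sketch for bodies moving in a plane. It is not a method taken from the text: it uses the kick-drift-kick leapfrog scheme, a standard integrator chosen here for its simplicity, and all function names, masses, initial data and step sizes are illustrative assumptions. The function `integrals` evaluates the total momentum, the angular momentum about the origin and the total energy, which are among the known constants of motion.

```python
import math

def accelerations(masses, pos, G=1.0):
    """Newtonian accelerations from (5.1) for bodies in a plane."""
    n = len(masses)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            inv_r3 = (dx * dx + dy * dy) ** -1.5
            acc[i][0] -= G * masses[j] * dx * inv_r3
            acc[i][1] -= G * masses[j] * dy * inv_r3
    return acc

def leapfrog(masses, pos, vel, dt, steps):
    """Advance positions and velocities with kick-drift-kick leapfrog steps."""
    pos = [list(p) for p in pos]
    vel = [list(v) for v in vel]
    acc = accelerations(masses, pos)
    for _ in range(steps):
        for v, a in zip(vel, acc):      # half kick
            v[0] += 0.5 * dt * a[0]
            v[1] += 0.5 * dt * a[1]
        for p, v in zip(pos, vel):      # drift
            p[0] += dt * v[0]
            p[1] += dt * v[1]
        acc = accelerations(masses, pos)
        for v, a in zip(vel, acc):      # half kick
            v[0] += 0.5 * dt * a[0]
            v[1] += 0.5 * dt * a[1]
    return pos, vel

def integrals(masses, pos, vel, G=1.0):
    """Total momentum, angular momentum about the origin, and total energy."""
    px = sum(m * v[0] for m, v in zip(masses, vel))
    py = sum(m * v[1] for m, v in zip(masses, vel))
    ang = sum(m * (p[0] * v[1] - p[1] * v[0])
              for m, p, v in zip(masses, pos, vel))
    energy = 0.5 * sum(m * (v[0] ** 2 + v[1] ** 2) for m, v in zip(masses, vel))
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            r = math.hypot(pos[i][0] - pos[j][0], pos[i][1] - pos[j][1])
            energy -= G * masses[i] * masses[j] / r
    return (px, py), ang, energy

# Illustrative three body initial state (assumed values).
masses = [1.0, 2.0, 3.0]
pos0 = [[1.0, 0.0], [-0.5, 0.866], [-0.5, -0.866]]
vel0 = [[0.0, 0.3], [-0.2, 0.0], [0.1, -0.1]]
before = integrals(masses, pos0, vel0)
pos1, vel1 = leapfrog(masses, pos0, vel0, dt=1e-3, steps=200)
after = integrals(masses, pos1, vel1)
```

Over this short run the momentum and angular momentum are preserved to rounding error and the energy to a small truncation error, which is the kind of check one would apply before trusting a longer integration. As the text notes, such a run produces one specific orbit and says little about the qualitative behavior of solutions in general.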


The General Three Body Problem 


The case of three bodies is not only the simplest unsolved many body 
problem, it is also the problem of greatest practical importance. Conse- 
quently, it is the most thoroughly studied and understood. 

The three body problem has been attacked most successfully by analyzing 
the possibilities for reducing it to the two body problem. Absorbing the 
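gravitational constant $G$ into the definition of mass, for 3 bodies the Equations
(5.1) can be written

$$\begin{aligned}
\ddot{\mathbf{x}}_1 &= -m_2\frac{\mathbf{x}_1 - \mathbf{x}_2}{|\mathbf{x}_1 - \mathbf{x}_2|^3} - m_3\frac{\mathbf{x}_1 - \mathbf{x}_3}{|\mathbf{x}_1 - \mathbf{x}_3|^3}, \\
\ddot{\mathbf{x}}_2 &= -m_3\frac{\mathbf{x}_2 - \mathbf{x}_3}{|\mathbf{x}_2 - \mathbf{x}_3|^3} - m_1\frac{\mathbf{x}_2 - \mathbf{x}_1}{|\mathbf{x}_2 - \mathbf{x}_1|^3}, \\
\ddot{\mathbf{x}}_3 &= -m_1\frac{\mathbf{x}_3 - \mathbf{x}_1}{|\mathbf{x}_3 - \mathbf{x}_1|^3} - m_2\frac{\mathbf{x}_3 - \mathbf{x}_2}{|\mathbf{x}_3 - \mathbf{x}_2|^3}.
\end{aligned} \tag{5.2}$$

With the center of mass as origin, the position vectors are related by

$$m_1\mathbf{x}_1 + m_2\mathbf{x}_2 + m_3\mathbf{x}_3 = 0. \tag{5.3}$$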


This reduces the Equations (5.2) to a system of order $18 - 6 = 12$. By using
the angular momentum and energy integrals, the order can be reduced to
$12 - 4 = 8$. The order can be further reduced to 7 by eliminating time as a
variable and then to 6 by a procedure called "the elimination of nodes".
However, the actual reduction is messy and not conducive to insight, so we
shall not carry it out. For the special case of orbits lying in a fixed plane, the
order is reduced to $6 - 2 = 4$, but the problem is still formidable.

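The three body equations of motion have their most symmetrical form
when expressed in terms of the relative position vectors $\mathbf{s}_1$, $\mathbf{s}_2$, $\mathbf{s}_3$ defined by

$$\mathbf{s}_1 = \mathbf{x}_3 - \mathbf{x}_2, \qquad \mathbf{s}_2 = \mathbf{x}_1 - \mathbf{x}_3, \qquad \mathbf{s}_3 = \mathbf{x}_2 - \mathbf{x}_1. \tag{5.4}$$

These variables are related by

$$\mathbf{s}_1 + \mathbf{s}_2 + \mathbf{s}_3 = 0 \tag{5.5}$$

(Figure 5.1). Solving (5.3) and (5.4) for the $\mathbf{x}_k$, we get

$$\begin{aligned}
m\mathbf{x}_1 &= m_3\mathbf{s}_2 - m_2\mathbf{s}_3, \\
m\mathbf{x}_2 &= m_1\mathbf{s}_3 - m_3\mathbf{s}_1, \\
m\mathbf{x}_3 &= m_2\mathbf{s}_1 - m_1\mathbf{s}_2,
\end{aligned} \tag{5.6}$$

where

$$m = m_1 + m_2 + m_3. \tag{5.7}$$

We need (5.6) to relate a solution in terms of the symmetrical variables $\mathbf{s}_k$ to
the fixed center of mass.

Fig. 5.1. Position vectors for the three body problem.

By substitution of (5.4) into (5.2), we get equations of motion in the symmetrical form

$$\begin{aligned}
\ddot{\mathbf{s}}_1 &= -m\frac{\mathbf{s}_1}{s_1^3} + m_1\mathbf{G}, \\
\ddot{\mathbf{s}}_2 &= -m\frac{\mathbf{s}_2}{s_2^3} + m_2\mathbf{G}, \\
\ddot{\mathbf{s}}_3 &= -m\frac{\mathbf{s}_3}{s_3^3} + m_3\mathbf{G},
\end{aligned} \tag{5.8}$$

where $s_k = |\mathbf{s}_k|$, and

$$\mathbf{G} = \frac{\mathbf{s}_1}{s_1^3} + \frac{\mathbf{s}_2}{s_2^3} + \frac{\mathbf{s}_3}{s_3^3}. \tag{5.9}$$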
The noteworthy simplicity of this formulation was first pointed out by Broucke 
and Lass in 1973. Evidently it had been overlooked in two centuries of 
research on the three body problem. We see below that it provides a direct 
route to the known exact solutions of the three body problem. It has yet to be 
exploited in the analysis of more difficult questions.


The Triangular Solutions


Note that the system of equations (5.8) decouples into a set of three similar two
body equations if $\mathbf{G} = 0$. Comparing (5.5) and (5.9), we see that this will
occur if

$$s_1 = s_2 = s_3, \tag{5.10}$$

that is, if the particles are located at the vertices of an equilateral triangle.
Then we can express two "sides" of the triangle in terms of the third by

$$\begin{aligned}
\mathbf{s}_1 &= \mathbf{s}_3\, e^{i2\pi/3} = -\tfrac{1}{2}(1 + i\sqrt{3})\,\mathbf{s}_3, \\
\mathbf{s}_2 &= \mathbf{s}_3\, e^{-i2\pi/3} = \tfrac{1}{2}(-1 + i\sqrt{3})\,\mathbf{s}_3,
\end{aligned} \tag{5.11}$$

where $i$ is the unit bivector for the plane of the triangle. The triangular
relation will be maintained if $i$ is constant, so $\mathbf{s}_1$ and $\mathbf{s}_2$ are determined by $\mathbf{s}_3$ at
all times. This is consistent with $\mathbf{G} = 0$ in (5.8). Thus, we have found a family
of solutions which reduce to solutions of the two body Kepler problem. As 
expressed by (5.11), the three particles remain at the vertices of an equilateral 
triangle, but the triangle may change its size and orientation in the plane as 
they move. The equilateral triangle solution was discovered by Lagrange. 
Note that it is completely independent of the particle masses. 

To describe the particle orbits with respect to the center of mass, we 
substitute (5.11) into (5.6) and find 


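$$\begin{aligned}
m\mathbf{x}_1 &= \tfrac{1}{2}\left[-2m_2 - m_3 + i m_3\sqrt{3}\,\right]\mathbf{s}_3, \\
m\mathbf{x}_2 &= \tfrac{1}{2}\left[2m_1 + m_3 + i m_3\sqrt{3}\,\right]\mathbf{s}_3, \\
m\mathbf{x}_3 &= \tfrac{1}{2}\left[m_1 - m_2 - i(m_1 + m_2)\sqrt{3}\,\right]\mathbf{s}_3.
\end{aligned} \tag{5.12}$$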


This shows that the particles follow similar two body orbits differing only in 
size and orientation determined by the masses. It should be noted that the 
acceleration vector of each particle points towards the center of mass (Exer- 
cise 5.1). Orbits for elliptical motion are shown in Figure 5.2. Similar orbits 
for hyperbolic and parabolic motions (Section 4-3) are easily constructed. 
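
The central character of the accelerations in the triangular solution can be verified numerically. The sketch below places masses in the ratio 1:2:3 at the corners of an equilateral triangle, shifts the origin to the center of mass, and evaluates the accelerations from (5.2). Complex numbers stand in for vectors in the plane, which is a convenience adopted here and not the notation of the text; the function name and the sample values are assumptions.

```python
import cmath
import math

def accelerations(masses, z):
    """Accelerations from (5.2), with G absorbed into the masses.

    Positions are complex numbers representing points in the plane.
    """
    acc = []
    for i, zi in enumerate(z):
        acc.append(sum(mj * (zj - zi) / abs(zj - zi) ** 3
                       for j, (mj, zj) in enumerate(zip(masses, z)) if j != i))
    return acc

masses = [1.0, 2.0, 3.0]
side = 1.0
# Corners of an equilateral triangle with the given side length.
corners = [side / math.sqrt(3) * cmath.exp(2j * math.pi * n / 3) for n in range(3)]
# Shift the origin to the center of mass, as required by (5.3).
center = sum(m * c for m, c in zip(masses, corners)) / sum(masses)
z = [c - center for c in corners]
acc = accelerations(masses, z)
total_mass = sum(masses)
```

Each acceleration comes out equal to $-m\,\mathbf{x}_k / s^3$, with $m$ the total mass and $s$ the side of the triangle, so every particle is attracted toward the center of mass as if by a single fixed body located there.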


The Collinear Solutions 


Euler found another exact solution of the three body equations where all 
particles lie on a line separated by distances in fixed ratio. To ascertain the 
general conditions for such a solution, suppose that particle 2 lies between the 
other two particles. Then the condition (5.5) is satisfied by writing 


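$$\mathbf{s}_1 = \lambda\mathbf{s}_3, \qquad \mathbf{s}_2 = -(1 + \lambda)\mathbf{s}_3, \tag{5.13}$$

where $\lambda$ is a positive scalar to be determined. Now, we can eliminate $\mathbf{G}$ in the
equations (5.8) to get

$$\begin{aligned}
\ddot{\mathbf{s}}_1 + m\frac{\mathbf{s}_1}{s_1^3} &= \frac{m_1}{m_3}\left(\ddot{\mathbf{s}}_3 + m\frac{\mathbf{s}_3}{s_3^3}\right), \\
\ddot{\mathbf{s}}_2 + m\frac{\mathbf{s}_2}{s_2^3} &= \frac{m_2}{m_3}\left(\ddot{\mathbf{s}}_3 + m\frac{\mathbf{s}_3}{s_3^3}\right).
\end{aligned} \tag{5.14}$$

Fig. 5.2. Lagrange's equilateral triangle solution for masses in the ratio $m_1 : m_2 : m_3 = 1 : 2 : 3$.

Inserting (5.13) in these equations to eliminate $\mathbf{s}_1$ and $\mathbf{s}_2$, we obtain

$$\begin{aligned}
(m_2 + m_3(1 + \lambda))\,\ddot{\mathbf{s}}_3 &= -(m_2 + m_3(1 + \lambda)^{-2})\,\frac{m\,\mathbf{s}_3}{s_3^3}, \\
(m_1 - m_3\lambda)\,\ddot{\mathbf{s}}_3 &= -(m_1 - m_3\lambda^{-2})\,\frac{m\,\mathbf{s}_3}{s_3^3}.
\end{aligned} \tag{5.15}$$

Consequently,

$$\frac{m_2 + m_3(1 + \lambda)}{m_2 + m_3(1 + \lambda)^{-2}} = \frac{m_1 - m_3\lambda}{m_1 - m_3\lambda^{-2}}.$$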

Putting this in standard form, we see that it is a fifth degree polynomial: 


$$(m_1 + m_2)\lambda^5 + (3m_1 + 2m_2)\lambda^4 + (3m_1 + m_2)\lambda^3 - (m_2 + 3m_3)\lambda^2 - (2m_2 + 3m_3)\lambda - (m_2 + m_3) = 0. \tag{5.16}$$

The left side is negative when $\lambda = 0$ and positive as $\lambda \to \infty$; therefore the
polynomial has a positive real root. By Descartes' rule of signs, it has no more
than one positive real root. Therefore $\lambda$ is a unique function of the masses,
and the two body solutions of (5.15) determine a family of collinear three 
body solutions. Two other solutions are obtained by putting different particles 
between the others. Thus, there are three distinct families of collinear three 
body solutions. A solution for the elliptic case is shown in Figure 5.3. 
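
Since (5.16) has exactly one positive root, it can be located by simple bisection. The sketch below does this and then checks that the resulting collinear configuration, with particle 2 between particles 1 and 3, has accelerations proportional to the positions relative to the center of mass. The function names, the unit of length ($s_3 = 1$) and the sample masses are assumptions made for illustration.

```python
def euler_quintic(lam, m1, m2, m3):
    """Left side of (5.16); particle 2 lies between particles 1 and 3."""
    return ((m1 + m2) * lam ** 5 + (3 * m1 + 2 * m2) * lam ** 4
            + (3 * m1 + m2) * lam ** 3 - (m2 + 3 * m3) * lam ** 2
            - (2 * m2 + 3 * m3) * lam - (m2 + m3))

def positive_root(m1, m2, m3, iterations=200):
    """Unique positive root of (5.16), found by bisection."""
    lo, hi = 0.0, 1.0
    while euler_quintic(hi, m1, m2, m3) <= 0.0:   # expand until the sign changes
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if euler_quintic(mid, m1, m2, m3) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def collinear_state(lam, m1, m2, m3):
    """Positions on a line (center of mass at the origin) and accelerations."""
    masses = (m1, m2, m3)
    x = [0.0, 1.0, 1.0 + lam]        # s3 = x2 - x1 = 1 and s1 = x3 - x2 = lam
    cm = sum(m * xi for m, xi in zip(masses, x)) / sum(masses)
    x = [xi - cm for xi in x]
    acc = [sum(mj * (xj - xi) / abs(xj - xi) ** 3
               for j, (mj, xj) in enumerate(zip(masses, x)) if j != i)
           for i, xi in enumerate(x)]
    return x, acc

lam = positive_root(1.0, 2.0, 3.0)
x, acc = collinear_state(lam, 1.0, 2.0, 3.0)
ratios = [a / xi for a, xi in zip(acc, x)]
```

The three entries of `ratios` coincide, which is the statement that each acceleration points toward the center of mass with a magnitude proportional to the distance from it. For equal masses the root is $\lambda = 1$, as symmetry requires.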


Fig. 5.3. Euler's collinear solution for masses in the ratio $m_1 : m_2 : m_3 = 1 : 2 : 3$.


Generalizations of the Lagrange and Euler solutions for systems with more 
than three particles have been found. As in the three particle case, in all these 
solutions the particles remain in a permanent configuration with accelerations 
directed towards the center of mass. The solutions are mainly of mathemati- 
cal interest, since they require extremely special initial conditions to be 
realized physically. 

The Euler and Lagrange solutions are the only known exact solutions of the 
three body problem. To get an understanding of the great range of other 
solutions, we turn to a qualitative analysis. 


Classification of Solutions 


A systematic classification and analysis of three body solutions based on 
energy and asymptotic behavior was initiated by Chazy in 1922 and refined by 
others since. Table 5.1 gives a current form of the classification using termi- 
nology suggested by Szebehely. 

The three body system can be described as a pair of two body systems, one 
consisting of two particles, the second consisting of the third particle and the 
center of mass of the other two. The motion of each two body system can be 
described as elliptic, parabolic or hyperbolic, depending on the system’s 
energy, and this description becomes exact asymptotically (that is, in the limit 
$t \to \infty$) if the separation of the third body from the others increases with time 
t when t is sufficiently large. The permissible asymptotic states of the two 
body system depend on the sign of the total energy of the three body system. 

When the total energy E is positive the classification is easy, because the 
energy of at least one two body subsystem must be positive, so its asymptotic 
motion must be hyperbolic and the two body systems separate. There are 
three possibilities, as shown in Table 5.1. In hyperbolic explosion all three 
particles depart along hyperbolas. In hyperbolic-parabolic explosion two 
particles depart along parabolas while the third departs along a hyperbola. In 
the hyperbolic-elliptic case, one particle escapes along a hyperbola while the 
others form a binary, that is, a two particle system bound in elliptic motion. 

TABLE 5.1. Classification of Three Body Solutions 

Total energy    Class                 Type 
E > 0           Explosion             hyperbolic 
                                      hyperbolic-parabolic 
                Escape                hyperbolic-elliptic 
E = 0           Escape                hyperbolic-elliptic 
                Explosion             parabolic 
E < 0           Escape                hyperbolic-elliptic 
                                      parabolic-elliptic 
                Bounded motion        Interplay 
                                      Ejection 
                                      Revolution 
                                      Equilibrium 
                                      Periodic 
                Oscillatory motion 

The classification for zero total energy is similar to the positive energy case. 
Of course, in all cases the parabolic motions are unlikely to be realized 
physically, because they require a specific value for the energy. 

When the total energy is negative, the classification is more complex. As in 
the positive energy case, the possibility exists that one particle escapes leaving 
a binary behind. However, there are also many types of bounded motion. The 
term interplay refers to motions with repeated close approaches among the 
particles. Ejection refers to motions where two particles form a binary, while 
the third is repeatedly ejected on large nearly elliptical orbits, much as comets 
are ejected from the solar system. Revolution is the case where the orbit of 
the third body surrounds a binary. We have discussed the equilibrium solutions 
of Lagrange and Euler already. As we show below in a special case, 
these solutions are unstable unless there are large differences in the masses. 
The periodic orbits can be of any type of bounded motion just mentioned. 

The oscillatory motion listed in Table 5.1 was not discovered until 1960. It 
consists of a binary and a third particle which moves along a line perpendicu- 
lar to the orbital plane of the binary and through its center of mass. The 
oscillating particle goes to infinity along this line while its velocity goes to zero 
in finite time, and this behavior is repeated as time goes to infinity. Obviously, 
this case is not of physical interest. 
The classification gives us a general picture of the possible motions, but it is 
not specific enough to answer the most important questions, such as, which 
type of motion will ensue from given initial conditions with negative total 
energy. However, it has been determined that for arbitrary initial conditions 
hyperbolic-elliptic escape is the most likely. Interplay is a necessary prelude 
to ejection, and repeated ejections may lead to escape. Usually, the particle 
which escapes has the smallest mass. 


The Restricted Three Body Problem 


The three body problem can be viewed as a perturbation by the third particle 
of the two body motion of the other two particles, called primaries. For this 
purpose, it is convenient to employ Jacobi coordinates x and r. The vector x is 
the position vector of the third particle with respect to the center of mass of 
the primaries, located at 

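$$\frac{m_1\mathbf{x}_1 + m_2\mathbf{x}_2}{m_1 + m_2} = -\frac{m_3\mathbf{x}_3}{m_1 + m_2}.$$

Therefore, 

$$\mathbf{x} = \frac{m}{\mu}\,\mathbf{x}_3, \tag{5.17}$$

where $m = m_1 + m_2 + m_3$ and $\mu = m_1 + m_2$. The relative position vector for 
the primaries is 

$$\mathbf{r} = \mathbf{x}_2 - \mathbf{x}_1. \tag{5.18}$$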


In terms of the Jacobi coordinates, the relative positions of the third particle 
with respect to the primaries are 

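$$\begin{aligned}
\mathbf{r}_1 &= \mathbf{x}_3 - \mathbf{x}_1 = \mathbf{x} + m_2\mu^{-1}\mathbf{r},\\
\mathbf{r}_2 &= \mathbf{x}_3 - \mathbf{x}_2 = \mathbf{x} - m_1\mu^{-1}\mathbf{r}.
\end{aligned} \tag{5.19}$$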


In terms of the Jacobi coordinates, the three body equations of motion (5.2) 
take the form of an equation for the relative motion of the primaries 


$$\ddot{\mathbf{r}} = -\mu\,\frac{\mathbf{r}}{r^3} + m_3\left(\frac{\mathbf{r}_2}{r_2^3} - \frac{\mathbf{r}_1}{r_1^3}\right), \tag{5.20}$$

coupled to an equation for the motion of the third body with respect to the 
primaries 

$$\ddot{\mathbf{x}} = -\frac{m}{\mu}\left(m_1\frac{\mathbf{r}_1}{r_1^3} + m_2\frac{\mathbf{r}_2}{r_2^3}\right). \tag{5.21}$$

Here, $\mathbf{r}_1$ and $\mathbf{r}_2$ are to be regarded as auxiliary variables defined in terms of the 
Jacobi variables by (5.19). 

Jacobi coordinates are most appropriate when the mass of the third body is 
much less than the mass of either primary. When the mass $m_3$ is so small that 
its influence on the primaries can be neglected, we can write $m = \mu = m_1 + m_2$ 
and Equations (5.20) and (5.21) reduce to 

$$\ddot{\mathbf{r}} = -\mu\,\frac{\mathbf{r}}{r^3}, \tag{5.22}$$

$$\ddot{\mathbf{x}} = -m_1\frac{\mathbf{r}_1}{r_1^3} - m_2\frac{\mathbf{r}_2}{r_2^3}. \tag{5.23}$$

The problem of solving these equations is called the restricted three body 
problem. 

We already know the general solution of the two body equation (5.22) for 
the primaries. However, the general solution of (5.23) is still unknown, 
though many special solutions have been found. Thus, the restricted three 
body problem is still an open area for research. 

Most three body research has concentrated on the circular restricted problem, 
which is restricted to circular solutions of the primary two body equation 
(5.22). Besides the helpful mathematical simplifications, this special case has 
important practical applications. It is a good model, for instance, of the 
Sun-Earth-Moon system, or of a spacecraft travelling between the Earth and 
Moon. So let us examine this case in more detail. 

To that end, it is convenient to make a slight change of notation in (5.22) 
and (5.23), writing 


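$$\ddot{\mathbf{r}}' = -\mu\,\frac{\mathbf{r}'}{{r'}^3}, \tag{5.24}$$

$$\ddot{\mathbf{x}}' = -m_1\frac{\mathbf{r}_1'}{{r_1'}^3} - m_2\frac{\mathbf{r}_2'}{{r_2'}^3}. \tag{5.25}$$

The circular solution to (5.24) with angular frequency $\omega$ is 

$$\mathbf{r}' = R^\dagger\mathbf{r}R = \mathbf{r}R^2, \tag{5.26}$$

where 

$$R = e^{\frac{1}{2} i\boldsymbol{\omega}t} \tag{5.27}$$

and $\mathbf{r}$ is a fixed vector, the relative position vector of the primaries in the 
rotating system. The motion of the primaries is most easily accounted for in 
(5.25) by transforming it to the rotating system in which the primaries are at 
rest. Accordingly, we write 

$$\mathbf{x}' = R^\dagger\mathbf{x}R, \qquad \mathbf{r}_k' = R^\dagger\mathbf{r}_kR \tag{5.28}$$

and substitute into (5.25) to get the equation of motion in the rotating system: 

$$\ddot{\mathbf{x}} + 2\boldsymbol{\omega}\times\dot{\mathbf{x}} = \mathbf{F}(\mathbf{x}), \tag{5.29}$$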


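where 

$$\mathbf{F}(\mathbf{x}) = -\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{x}) - m_1\frac{\mathbf{r}_1}{r_1^3} - m_2\frac{\mathbf{r}_2}{r_2^3} \tag{5.30}$$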


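is an “effective force” with $\mathbf{r}_1$ and $\mathbf{r}_2$ given by (5.19). Thus, the circular 
restricted problem has been reduced to solving this equation. Before looking 
for specific solutions, we ascertain some general characteristics of the 
equation. 

We note that the “effective force” $\mathbf{F}(\mathbf{x})$ is a conservative force with 
potential 

$$U(\mathbf{x}) = -\tfrac{1}{2}(\boldsymbol{\omega}\times\mathbf{x})^2 - \frac{m_1}{r_1} - \frac{m_2}{r_2}, \tag{5.31}$$

that is, 

$$\mathbf{F} = -\nabla U. \tag{5.32}$$

This can be proved by using properties of the gradient operator $\nabla = \nabla_{\mathbf{x}}$ 
established in Section 2-8 to carry out the differentiation of (5.31). Using 
$\nabla(\boldsymbol{\omega}\cdot\mathbf{x}) = \boldsymbol{\omega}$ and $\nabla\mathbf{x}^2 = 2\mathbf{x}$, we find that the negative gradient of the 
“centrifugal pseudopotential” 

$$-\tfrac{1}{2}(\boldsymbol{\omega}\times\mathbf{x})^2 = \tfrac{1}{2}(\boldsymbol{\omega}\wedge\mathbf{x})^2 = \tfrac{1}{2}\left[(\boldsymbol{\omega}\cdot\mathbf{x})^2 - \omega^2\mathbf{x}^2\right] \tag{5.33}$$

is the “centrifugal pseudoforce” 

$$-\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{x}) = \boldsymbol{\omega}\cdot(\boldsymbol{\omega}\wedge\mathbf{x}) = \omega^2\mathbf{x} - \boldsymbol{\omega}(\boldsymbol{\omega}\cdot\mathbf{x}). \tag{5.34}$$

The last two terms in (5.30) are obtained from the last two terms of (5.31) by 
using the chain rule and $\nabla r_k = \mathbf{r}_k/r_k$. 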

Dotting the equation of motion (5.29) with $\dot{\mathbf{x}}$ and using (5.32), we easily 
prove that 

$$C = \tfrac{1}{2}\dot{\mathbf{x}}^2 + U(\mathbf{x}), \tag{5.35}$$


known as Jacobi's integral, is a constant of the motion. Of course, this is just 
the energy integral for the three body system with the contributions of the 
primaries removed, as allowed by the approximation in the restricted prob- 
lem. 

Before studying particular solutions of the restricted problem, we can 
simplify computations by choosing a unit of length so that 


$$r = 1, \tag{5.36a}$$

a unit of time so that 

$$\omega = 1, \tag{5.36b}$$

and a unit of mass so that 

$$\mu = m_1 + m_2 = 1. \tag{5.36c}$$

Defining a mass difference parameter $\gamma$ by 

$$m_1 - m_2 = \mu\gamma, \tag{5.36d}$$

we have 

$$m_1 = \tfrac{1}{2}(1 + \gamma), \qquad m_2 = \tfrac{1}{2}(1 - \gamma), \tag{5.36e}$$

where $\gamma$ is positive for $m_1 > m_2$. 
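
To make the normalized problem concrete, here is a minimal Python sketch of the effective potential (5.31) and force (5.30) for motion in the plane of the primaries, together with a finite-difference check of (5.32). It assumes the units (5.36), places the primaries on the x-axis at $-m_2$ and $+m_1$ as implied by (5.19), and takes $\boldsymbol{\omega}$ normal to the plane. The function names are illustrative only.

```python
import math


def primaries(gamma):
    """Masses (5.36e) and x-axis positions of the primaries in the units (5.36)."""
    m1, m2 = 0.5 * (1.0 + gamma), 0.5 * (1.0 - gamma)
    # From (5.19) with mu = 1 and unit separation: x1 = -m2 r, x2 = +m1 r.
    return m1, m2, -m2, m1


def potential(x, y, gamma):
    """Effective potential (5.31) in the primary plane, with omega = 1."""
    m1, m2, x1, x2 = primaries(gamma)
    d1 = math.hypot(x - x1, y)
    d2 = math.hypot(x - x2, y)
    return -0.5 * (x * x + y * y) - m1 / d1 - m2 / d2


def force(x, y, gamma):
    """Effective force (5.30) in the primary plane, with omega = 1."""
    m1, m2, x1, x2 = primaries(gamma)
    d1 = math.hypot(x - x1, y)
    d2 = math.hypot(x - x2, y)
    fx = x - m1 * (x - x1) / d1**3 - m2 * (x - x2) / d2**3
    fy = y - m1 * y / d1**3 - m2 * y / d2**3
    return fx, fy


def numerical_gradient(x, y, gamma, h=1e-6):
    """Central-difference gradient of the potential."""
    ux = (potential(x + h, y, gamma) - potential(x - h, y, gamma)) / (2.0 * h)
    uy = (potential(x, y + h, gamma) - potential(x, y - h, gamma)) / (2.0 * h)
    return ux, uy


if __name__ == "__main__":
    print(force(0.3, 0.7, 0.9))
    print(numerical_gradient(0.3, 0.7, 0.9))  # should be the negative of the force
```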
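

Equilibrium Points 


A point $\mathbf{x}_0$ at which the force function $\mathbf{F}(\mathbf{x})$ vanishes is called an equilibrium 
point (stationary point or libration point) of the differential equation (5.29). A 
particle initially at rest at an equilibrium point will remain at rest, because its 
acceleration vanishes. The equilibrium points are “critical points” of the 
potential $U(\mathbf{x})$, for, from (5.32) we see that 

$$\mathbf{a}\cdot\nabla U(\mathbf{x}_0) = -\mathbf{a}\cdot\mathbf{F}(\mathbf{x}_0) = 0,$$

that is, at an equilibrium point $\mathbf{x}_0$ the directional derivative of the potential 
vanishes in every direction $\mathbf{a}$. 

The equilibrium points are solutions of the equation 

$$\mathbf{F}(\mathbf{x}_0) = \mathbf{x}_0 - \boldsymbol{\omega}(\boldsymbol{\omega}\cdot\mathbf{x}_0) - m_1\frac{\mathbf{r}_1}{r_1^3} - m_2\frac{\mathbf{r}_2}{r_2^3} = 0, \tag{5.37}$$

where, by (5.19) and (5.36c), 

$$\mathbf{r}_1 = \mathbf{x}_0 + m_2\mathbf{r}, \qquad \mathbf{r}_2 = \mathbf{x}_0 - m_1\mathbf{r}. \tag{5.38}$$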


Dotting (5.37) and (5.38) with $\boldsymbol{\omega}$, we determine that 

$$\boldsymbol{\omega}\cdot\mathbf{r}_1 = \boldsymbol{\omega}\cdot\mathbf{r}_2 = \boldsymbol{\omega}\cdot\mathbf{x}_0 = 0.$$

Therefore, all equilibrium points lie in the orbital plane of the primaries 
(called the primary plane). 

The solutions of (5.37) are just restricted versions of the exact solutions of 
Lagrange and Euler which we found for the general three body problem. 
Therefore, since the primaries have a fixed separation, there are two 
equilibrium points $L_4$, $L_5$ at the vertices of equilateral triangles with 
$r_1^2 = r_2^2 = r^2$, and three equilibrium points $L_1$, $L_2$, $L_3$ collinear with the 
positions of the primaries (Figure 5.4). The five equilibrium points 
$L_1$, $L_2$, $L_3$, $L_4$, $L_5$ are called Lagrange points. 

Fig. 5.4. The five Lagrange points of the restricted three body problem. 

The Lagrange points are of more than mere academic interest. Groups of 
asteroids known as the “Trojans” are found near the points $L_4$ and $L_5$ of the 
Sun-Jupiter system. One of the triangular Lagrange points of the Earth-Moon 
system has been suggested as a suitable place for a space colony. Reflected 
light from asteroids temporarily trapped near the collinear Lagrange point of 
the Sun-Earth system that lies beyond the Earth, opposite the Sun, may 
be responsible for a faintly glowing spot in the night sky called the gegenschein. 
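
The equilibrium condition (5.37) is easy to explore numerically. The Python sketch below assumes the units (5.36) with the primaries on the x-axis at $-m_2$ and $+m_1$; it finds the three collinear equilibrium points by bisection and evaluates the force at a triangular point. The points are identified by location rather than by index, since the numbering depends on the convention of Figure 5.4. All function names are illustrative, and the method assumes $0 \le \gamma < 1$ so that both masses are nonzero.

```python
import math


def masses(gamma):
    """Masses of the primaries from (5.36e)."""
    return 0.5 * (1.0 + gamma), 0.5 * (1.0 - gamma)


def plane_force(x, y, gamma):
    """Effective force (5.30) in the primary plane, in the units (5.36)."""
    m1, m2 = masses(gamma)
    d1 = math.hypot(x + m2, y)
    d2 = math.hypot(x - m1, y)
    fx = x - m1 * (x + m2) / d1**3 - m2 * (x - m1) / d2**3
    fy = y - m1 * y / d1**3 - m2 * y / d2**3
    return fx, fy


def axis_force(x, gamma):
    """x-component of the force on the line through the primaries."""
    return plane_force(x, 0.0, gamma)[0]


def bisect(f, lo, hi, tol=1e-13):
    """Bisection on [lo, hi], assuming f changes sign exactly once there."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0.0) == (flo > 0.0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def collinear_points(gamma, eps=1e-6):
    """Collinear equilibria: beyond primary 1, between, beyond primary 2."""
    m1, m2 = masses(gamma)

    def f(x):
        return axis_force(x, gamma)

    left = bisect(f, -2.0, -m2 - eps)
    middle = bisect(f, -m2 + eps, m1 - eps)
    right = bisect(f, m1 + eps, 2.0)
    return left, middle, right


if __name__ == "__main__":
    gamma = (81.4 - 1.0) / (81.4 + 1.0)  # Earth-Moon mass ratio quoted in the text
    print(collinear_points(gamma))
    # Triangular point: apex of the equilateral triangle on the primaries.
    print(plane_force(0.5 * gamma, math.sqrt(3.0) / 2.0, gamma))
```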


Stability of the Lagrange Points 


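An equilibrium point is said to be stable if a particle stays near it when 
subjected to small disturbances. To investigate the stability at $\mathbf{x}_0$, we 
determine how the force function $\mathbf{F}(\mathbf{x})$ varies with small displacement $\boldsymbol{\varepsilon}$ from $\mathbf{x}_0$. A 
Taylor expansion (Section 2-8) gives us 

$$\mathbf{F}(\mathbf{x}_0 + \boldsymbol{\varepsilon}) = \mathbf{F}(\mathbf{x}_0) + \boldsymbol{\varepsilon}\cdot\nabla\mathbf{F}(\mathbf{x}_0) + \cdots, \tag{5.39}$$

where higher order terms can be neglected. Introducing the notation 
$\mathbf{F}'(\boldsymbol{\varepsilon}) = \boldsymbol{\varepsilon}\cdot\nabla\mathbf{F}(\mathbf{x}_0)$ and differentiating (5.30) we obtain 

$$\mathbf{F}'(\boldsymbol{\varepsilon}) = -\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\boldsymbol{\varepsilon}) - \left(\frac{m_1}{r_1^3} + \frac{m_2}{r_2^3}\right)\boldsymbol{\varepsilon} + 3\left[\frac{m_1\mathbf{r}_1\,\mathbf{r}_1\cdot\boldsymbol{\varepsilon}}{r_1^5} + \frac{m_2\mathbf{r}_2\,\mathbf{r}_2\cdot\boldsymbol{\varepsilon}}{r_2^5}\right], \tag{5.40}$$

where $\mathbf{r}_1$ and $\mathbf{r}_2$ are given by (5.38). With $\mathbf{F}(\mathbf{x}_0) = 0$, substitution of (5.39) and 
$\mathbf{x}(t) = \mathbf{x}_0 + \boldsymbol{\varepsilon}(t)$ into the equation of motion (5.29) yields the variational 
equation 

$$\ddot{\boldsymbol{\varepsilon}} + 2\boldsymbol{\omega}\times\dot{\boldsymbol{\varepsilon}} = \mathbf{F}'(\boldsymbol{\varepsilon}). \tag{5.41}$$

This is the linearized equation for motion near an equilibrium point. We say 
“linearized”, because $\mathbf{F}'(\boldsymbol{\varepsilon})$ is the linear approximation to the force $\mathbf{F}(\mathbf{x}_0 + \boldsymbol{\varepsilon})$. 
The equation is called a variational equation, because it describes deviations 
(variations) from a reference orbit, namely the circular orbit of an equilibrium 
point in the primary plane. 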

The theory of differential equations tells us that stability (instability) of the 
linearized equation is necessary for stability (instability) of the nonlinear 
equation it approximates. To prove that stability of the linearized equation is 
also sufficient for stability of the nonlinear equation is a difficult mathematical 
problem that has been solved only for particular cases, too difficult to broach 
here. We must be content with the results of a linear analysis, a study of 
solutions to the linearized equation (5.41). Stability at the various Lagrange 
points must be examined separately. But first it is advisable to ascertain the 
qualitative features of the variational equation which contribute to stability. 

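For the displacement component $\varepsilon_3 = \hat{\boldsymbol{\omega}}\cdot\boldsymbol{\varepsilon}$ along the direction $\hat{\boldsymbol{\omega}}$ normal to 
the primary plane, by dotting (5.40) and (5.41) with $\hat{\boldsymbol{\omega}}$ we obtain the equation 

$$\ddot{\varepsilon}_3 = -\left(\frac{m_1}{r_1^3} + \frac{m_2}{r_2^3}\right)\varepsilon_3. \tag{5.42}$$

Since the coefficient is positive, $\varepsilon_3$ is limited to small harmonic oscillations. 
Therefore, the equilibrium points are stable with respect to displacements 
normal to the primary plane. So, for the rest of our analysis we limit our 
attention to motions within the primary plane. 

From (5.40) we see that $\mathbf{F}'(\boldsymbol{\varepsilon})$ is a symmetric linear function. Moreover, for 
displacements in the primary plane, 

$$\boldsymbol{\varepsilon}\cdot\mathbf{F}'(\boldsymbol{\varepsilon}) = \left[1 - \left(\frac{m_1}{r_1^3} + \frac{m_2}{r_2^3}\right)\right]\boldsymbol{\varepsilon}^2 + 3\left[\frac{m_1}{r_1^5}(\mathbf{r}_1\cdot\boldsymbol{\varepsilon})^2 + \frac{m_2}{r_2^5}(\mathbf{r}_2\cdot\boldsymbol{\varepsilon})^2\right]. \tag{5.43}$$

We shall see that the first coefficient vanishes when evaluated at the Lagrange 
points $L_4$ and $L_5$, so $\boldsymbol{\varepsilon}\cdot\mathbf{F}'(\boldsymbol{\varepsilon}) > 0$ for $\boldsymbol{\varepsilon} \neq 0$. This tells us that $\mathbf{F}'(\boldsymbol{\varepsilon})$ is a repulsive 
force increasing linearly with distance. We can express this in another way by 
inserting (5.32) into (5.43) to get 

$$(\boldsymbol{\varepsilon}\cdot\nabla)^2U(\mathbf{x}_0) = -\boldsymbol{\varepsilon}\cdot\mathbf{F}'(\boldsymbol{\varepsilon}) < 0, \tag{5.44}$$

which holds for any direction $\hat{\boldsymbol{\varepsilon}}$ in the primary plane. This tells us that the 
effective potential $U(\mathbf{x})$ is a maximum at the equilibrium points. In other 
words, the Lagrange points $L_4$ and $L_5$ are peaks of potential hills in the 
primary plane. Nevertheless, $L_4$ and $L_5$ are stable equilibrium points if a 
certain condition on the masses of the primaries is met. 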

To see how stability is possible on a potential hill, consider a particle at rest 
which has been nudged off the peak. The repulsive force accelerates it 
downhill. As its velocity increases, according to (5.41) the Coriolis force 
increases, and the particle is deflected to the right. If the deflection is 
sufficient, the particle starts back uphill and slows down until it starts to fall 
down again, repeating the process. Thus, the Coriolis force can bind a particle 
to a “pseudopotential” maximum, producing stability. The rightward deflection 
of the Coriolis force is opposite in sense to the rotation of the primaries, 
which we express by saying that the particle’s motion is retrograde. 

To see if the conditions for stability are met at the Lagrange points, we 
must examine the solutions of the variational equation (5.41) quantitatively. 
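At the triangular Lagrange points $L_4$ and $L_5$, $r_1^2 = r_2^2 = r^2 = 1$, so (5.40) 
reduces to a strictly repulsive force 

$$\mathbf{F}'(\boldsymbol{\varepsilon}) = 3(m_1\mathbf{r}_1\,\mathbf{r}_1\cdot\boldsymbol{\varepsilon} + m_2\mathbf{r}_2\,\mathbf{r}_2\cdot\boldsymbol{\varepsilon}) = \tfrac{3}{2}(\boldsymbol{\varepsilon} + m_1\mathbf{r}_1\boldsymbol{\varepsilon}\mathbf{r}_1 + m_2\mathbf{r}_2\boldsymbol{\varepsilon}\mathbf{r}_2). \tag{5.45}$$

At $L_4$, the sides of the triangle are related by 

$$\begin{aligned}
\mathbf{r}_1 &= \mathbf{r}e^{\mathbf{i}\pi/3} = \mathbf{r}\,\tfrac{1}{2}(1 + \mathbf{i}\sqrt{3}),\\
\mathbf{r}_2 &= \mathbf{r}e^{\mathbf{i}2\pi/3} = \mathbf{r}\,\tfrac{1}{2}(-1 + \mathbf{i}\sqrt{3}),
\end{aligned} \tag{5.46}$$

where $\mathbf{i} = i\hat{\boldsymbol{\omega}}$ is the unit bivector for the primary plane. Note that $\boldsymbol{\omega}\cdot\boldsymbol{\varepsilon} = 0$ 
implies $\mathbf{i}\boldsymbol{\varepsilon} = -\boldsymbol{\varepsilon}\mathbf{i}$, so substitution of (5.46) and (5.36) into (5.45) leads to the 
following explicit form for the linearized force: 

$$\mathbf{F}'(\boldsymbol{\varepsilon}) = \tfrac{3}{2}\boldsymbol{\varepsilon} + \tfrac{3}{4}\mathbf{r}\boldsymbol{\varepsilon}\mathbf{r}\,(-1 + \mathbf{i}\gamma\sqrt{3}). \tag{5.47}$$

This can be simplified by expressing $\mathbf{F}'$ in terms of its eigenvectors. That is 
most easily done by writing 

$$\mathbf{r} = \mathbf{e}_1e^{\mathbf{i}\phi}, \tag{5.48}$$

where $\mathbf{e}_1$ is an eigenvector which can be obtained from $\mathbf{r}$ when $\phi$ is known. 
Now, 

$$\mathbf{r}\boldsymbol{\varepsilon}\mathbf{r} = \mathbf{e}_1\boldsymbol{\varepsilon}\mathbf{e}_1e^{2\mathbf{i}\phi},$$

and this will simplify the last term in (5.47) if 

$$\tfrac{3}{4}(-1 + \mathbf{i}\gamma\sqrt{3}) = \beta e^{-2\mathbf{i}\phi}.$$

This implies that 

$$\tan 2\phi = \gamma\sqrt{3}, \tag{5.49}$$

so we can put (5.47) in the general form 

$$\mathbf{F}'(\boldsymbol{\varepsilon}) = \alpha\boldsymbol{\varepsilon} + \beta\mathbf{e}_1\boldsymbol{\varepsilon}\mathbf{e}_1, \tag{5.50}$$

with 

$$\alpha = \tfrac{3}{2} \quad\text{and}\quad \beta = -\tfrac{3}{4}(1 + 3\gamma^2)^{1/2} \tag{5.51}$$

in this particular case. From (5.50) 

$$\begin{aligned}
\mathbf{F}'(\mathbf{e}_1) &= (\alpha + \beta)\mathbf{e}_1,\\
\mathbf{F}'(\mathbf{e}_2) &= (\alpha - \beta)\mathbf{e}_2,
\end{aligned} \tag{5.52}$$

showing that $\mathbf{e}_1$ and $\mathbf{e}_2 = \mathbf{e}_1\mathbf{i}$ are eigenvectors of $\mathbf{F}'$ with eigenvalues $\alpha \pm \beta$. It 
is worth noting that we have illustrated here a new general method for solving 
the eigenvalue problem in two dimensions, which has some advantages over 
the methods developed in Section 5-2. 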
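
As a cross-check on (5.51) and (5.52), the following Python sketch builds the symmetric 2x2 matrix of the linearized force (5.45) at a triangular point and compares its eigenvalues with $\alpha \pm \beta$. It uses ordinary Cartesian components rather than the geometric algebra of the text, assumes the units (5.36), and takes $\mathbf{r}$ along the x-axis; the function names are illustrative.

```python
import math


def linearized_force_matrix(gamma):
    """Components (fxx, fxy, fyy) of F'(eps) = 3(m1 r1 r1.eps + m2 r2 r2.eps)."""
    m1, m2 = 0.5 * (1.0 + gamma), 0.5 * (1.0 - gamma)
    s = math.sqrt(3.0) / 2.0
    r1 = (0.5, s)    # unit vector from primary 1 to the triangular point
    r2 = (-0.5, s)   # unit vector from primary 2 to the triangular point
    fxx = 3.0 * (m1 * r1[0] * r1[0] + m2 * r2[0] * r2[0])
    fxy = 3.0 * (m1 * r1[0] * r1[1] + m2 * r2[0] * r2[1])
    fyy = 3.0 * (m1 * r1[1] * r1[1] + m2 * r2[1] * r2[1])
    return fxx, fxy, fyy


def symmetric_eigenvalues(fxx, fxy, fyy):
    """Eigenvalues (larger, smaller) of a symmetric 2x2 matrix in closed form."""
    mean = 0.5 * (fxx + fyy)
    radius = math.hypot(0.5 * (fxx - fyy), fxy)
    return mean + radius, mean - radius


def alpha_beta(gamma):
    """The coefficients (5.51)."""
    return 1.5, -0.75 * math.sqrt(1.0 + 3.0 * gamma**2)


if __name__ == "__main__":
    gamma = 0.5
    alpha, beta = alpha_beta(gamma)
    print(symmetric_eigenvalues(*linearized_force_matrix(gamma)))
    print(alpha - beta, alpha + beta)  # beta < 0, so alpha - beta is the larger
```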
We use (5.50) to put (5.41) in the form 


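$$\ddot{\boldsymbol{\varepsilon}} + 2\dot{\boldsymbol{\varepsilon}}\mathbf{i} = \alpha\boldsymbol{\varepsilon} + \beta\mathbf{e}_1\boldsymbol{\varepsilon}\mathbf{e}_1. \tag{5.53}$$

It is convenient to reformulate this as an equation for the spinor 

$$Z = \mathbf{e}_1\boldsymbol{\varepsilon}, \tag{5.54}$$

which relates $\boldsymbol{\varepsilon}$ to $\mathbf{e}_1$ by $\boldsymbol{\varepsilon} = \mathbf{e}_1Z$. Thus, we obtain 

$$\ddot{Z} + 2\mathbf{i}\dot{Z} = \alpha Z + \beta Z^\dagger. \tag{5.55}$$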


Our stability problem has been reduced to studying the solutions of this 
equation. 
From our experience with linear differential equations in Chapter 3, we 
know that (5.55) has circular solutions of the form $Z = a\exp(\mathbf{i}\lambda t)$ if $\beta = 0$. 
However, when $\beta \neq 0$ the solutions cannot be of this form, because the 
conjugation $Z^\dagger$ changes the sign of the exponential. This suggests that we 
consider a trial solution of the form 

$$Z = ae^{\mathbf{i}\lambda t} + be^{-\mathbf{i}\lambda t}, \tag{5.56}$$

where $a$ and $b$ are complex coefficients to be determined by the initial 
conditions. The parameter $\lambda$ must be real (i.e. scalar) for a stable solution, for 
if it has a nonzero imaginary part one of the exponential factors will grow 
without bound. 

Substituting (5.56) into (5.55) and separately equating coefficients of the 
different exponential factors, we obtain 

$$\begin{aligned}
a(\lambda^2 + 2\lambda + \alpha) &= -\beta b^\dagger,\\
b(\lambda^2 - 2\lambda + \alpha) &= -\beta a^\dagger.
\end{aligned} \tag{5.57}$$

Assuming $\lambda = \lambda^\dagger$, we eliminate $a$ and $b$ from these equations to get 

$$(\lambda^2 + \alpha)^2 - 4\lambda^2 = \beta^2.$$

This is a quadratic equation for $\lambda^2$ with the solution 

$$\lambda_\pm^2 = (2 - \alpha) \pm (\beta^2 - 4\alpha + 4)^{1/2}. \tag{5.58}$$

For the particular values of $\alpha$ and $\beta$ in (5.51), this becomes 

$$\lambda_\pm^2 = \tfrac{1}{2}\left[1 \pm \tfrac{1}{2}(27\gamma^2 - 23)^{1/2}\right]. \tag{5.59}$$

Both roots will be real if and only if 

$$\gamma = \frac{m_1 - m_2}{m_1 + m_2} > \left(\frac{23}{27}\right)^{1/2} = 0.922958, \tag{5.60}$$

and both real roots will be positive since $\gamma^2 < 1$. Thus we have found a 
condition on the masses of the primaries necessary for stability at $L_4$. 

For the Sun-Jupiter system, the mass ratio is about 1000:1, so $\gamma \approx 0.998$. 
For the Earth-Moon system, the mass ratio is about 81.4:1 so $\gamma \approx 0.976$. 
These values satisfy the inequality (5.60), so the Lagrange points $L_4$ and $L_5$ 
are stable for both systems. 
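
The arithmetic behind this conclusion can be reproduced with a short Python sketch. It evaluates $\gamma$ from a mass ratio and the roots (5.58) with the values of $\alpha$ and $\beta$ from (5.51). The function names are illustrative, and the mass ratios are the rounded values quoted above.

```python
import math

GAMMA_CRITICAL = math.sqrt(23.0 / 27.0)  # threshold in (5.60)


def gamma_from_ratio(ratio):
    """Mass difference parameter for primaries with m1 : m2 = ratio : 1."""
    return (ratio - 1.0) / (ratio + 1.0)


def lambda_squared(gamma):
    """Roots (5.58) with alpha, beta from (5.51); None if they are complex."""
    alpha = 1.5
    beta_sq = (9.0 / 16.0) * (1.0 + 3.0 * gamma**2)
    disc = beta_sq - 4.0 * alpha + 4.0
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return (2.0 - alpha) + root, (2.0 - alpha) - root


if __name__ == "__main__":
    for name, ratio in (("Sun-Jupiter", 1000.0), ("Earth-Moon", 81.4)):
        g = gamma_from_ratio(ratio)
        print(name, g, g > GAMMA_CRITICAL, lambda_squared(g))
```

For both systems the two values of $\lambda^2$ come out real and positive, as required for bounded solutions of the form (5.56).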

For a real and positive root $\lambda$ determined by (5.59), we know from our 
study of the harmonic oscillator that the solution (5.56) describes an ellipse. 
However, (5.56) is not the form for a general solution as it is in the harmonic 
oscillator case, for the coefficients $a$ and $b$ are not mutually independent, 
being related by (5.57). It is of some interest, therefore, to determine initial 
conditions which produce such a special solution. We choose initial time in 
(5.56) so that $a$ is real and positive. Then (5.57) and (5.51) tell us that 

$$\frac{b}{a} = \frac{\lambda^2 + 2\lambda + \alpha}{-\beta} = \frac{\lambda^2 + 2\lambda + \alpha}{\left[(\lambda^2 + \alpha)^2 - 4\lambda^2\right]^{1/2}} > 1. \tag{5.61}$$

Using (5.54) and (5.56), we can write the solution in the form 

$$\boldsymbol{\varepsilon} = \mathbf{e}_1\left[(a + b)\cos\lambda t + \mathbf{i}(a - b)\sin\lambda t\right]. \tag{5.62}$$

This shows us that the major axis of the elliptical orbit must be aligned with 
the principal axis $\mathbf{e}_1$ of the force function $\mathbf{F}'$. Then (5.61) implies that the 
coefficient $(a - b)$ is negative, which tells us that the orbit is retrograde. For 
an initial position $\boldsymbol{\varepsilon}_0$ on the principal axis, the constant $(a + b)$ is determined 
by 

$$\boldsymbol{\varepsilon}_0 = \mathbf{e}_1(a + b). \tag{5.63a}$$

The initial velocity is then determined by 

$$\dot{\boldsymbol{\varepsilon}}_0 = \mathbf{e}_1\mathbf{i}(a - b)\lambda = \boldsymbol{\varepsilon}_0\mathbf{i}\lambda\left(\frac{a - b}{a + b}\right), \tag{5.63b}$$

where the constant $(a - b)/(a + b)$ is determined by (5.61). More generally, 
Equation (5.62) determines a unique orbit through any specified initial 
position. 

According to (5.59), two distinct positive values for $\lambda$ are allowed, say $\lambda_1$ 
and $\lambda_2$ with $\lambda_1 > \lambda_2$. For each of these, there is a special solution of the form 
(5.62). So through any given point there pass exactly two retrograde elliptical 
orbits, an orbit with large angular frequency $\lambda_1$ and one with small frequency 
$\lambda_2$. Every allowed motion is a superposition of these two, that is to say, the 
general solution of the variational equation (5.53) has the form 

$$\boldsymbol{\varepsilon} = \mathbf{e}_1\left(a_1e^{\mathbf{i}\lambda_1t} + b_1e^{-\mathbf{i}\lambda_1t} + a_2e^{\mathbf{i}\lambda_2t} + b_2e^{-\mathbf{i}\lambda_2t}\right), \tag{5.64}$$

where the $a_k$ determine the $b_k$ by (5.57), and $a_1$, $a_2$ can be regarded as two 
independent complex coefficients determined by the initial conditions. 

Stability of the collinear Lagrange points $L_1$, $L_2$, $L_3$ can be investigated in 
the same way as $L_4$. The linearized force is found to have the same general 
form (5.50) with $\mathbf{e}_1 = \hat{\mathbf{r}}$, but the values of $\alpha$ and $\beta$ differ from those in (5.51). 
They produce roots of opposite sign from (5.58). The positive root characterizes 
a retrograde elliptical orbit just as in the $L_4$ case. However, the negative 
root implies that $\lambda$ is imaginary, and this leads to an exponentially divergent 
solution when inserted in (5.56). Thus, special initial conditions produce 
bounded elliptical motion at the collinear Lagrange points, but these equilibrium 
points must be regarded as unstable, because the general solution (of the 
form (5.64)) is divergent. In a real physical situation where an object is 
trapped in an elliptical orbit at one of the collinear points, external disturbances 
will eventually deflect it to an “unbounded orbit” so it escapes. 

A global perspective on possible orbits with a given energy can be gained 
from a contour map of the effective potential, which has the form 

$$U(\mathbf{x}) = -\tfrac{1}{2}\mathbf{x}^2 - \frac{m_1}{r_1} - \frac{m_2}{r_2} \tag{5.65}$$

in the primary plane. At large distances the first term dominates, so the 
potential decreases with $\mathbf{x}^2$ and the equipotential curves are nearly circular. 
Each mass is centered in a potential well, so nearby equipotentials are circles 
around it. The Lagrange points $L_4$ and $L_5$ are potential maxima, as we have 
shown. These are the critical features of the contour map in Figure 5.5. The 
map shows the three collinear Lagrange points as saddle points. This can be 
verified analytically by showing that, at the collinear points, 

$$(\boldsymbol{\varepsilon}\cdot\nabla)^2U = -\boldsymbol{\varepsilon}\cdot\mathbf{F}'(\boldsymbol{\varepsilon}) < 0 \tag{5.66a}$$

for $\hat{\boldsymbol{\varepsilon}} = \hat{\mathbf{r}}$, and 

$$(\boldsymbol{\varepsilon}\cdot\nabla)^2U > 0 \tag{5.66b}$$

for $\hat{\boldsymbol{\varepsilon}} = \hat{\mathbf{r}}\mathbf{i}$. 

Fig. 5.5. Contour map of the Earth-Moon potential in the synodic (rotating) reference system. 


The energy integral (5.35) implies that a particle with total energy $C$ cannot 
cross a contour determined by the equation $U(\mathbf{x}) = C$; its motion is confined 
to regions where $U \le C$. Thus, a particle trapped at $L_4$ will circle the peak but 
never climb higher than $U = C$. It is worth noting that if the particle's kinetic 
energy is dissipated by collisions with gas, dust or small bodies, then it will 
slide down the peak, increasing the amplitude of its oscillations about $L_4$. 
This has been suggested as the origin of the large amplitude oscillations of the 
Trojan asteroids. 

Fig. 5.6. Earth-Moon “bus route” in the synodic (rotating) reference system (found by 
Arenstorf (1963)). 

The regions excluded by energy conservation are called Hill's regions by 
astronomers, after G. W. Hill who pointed out that the stability of the Moon’s 
orbit is assured by the fact that it lies within its bounding contour which 
encircles the Earth. The contour map is a helpful guide in the search for 
periodic solutions, to which we now turn. 
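
The role of Jacobi's integral as a bound on the motion can be checked by direct integration. The Python sketch below integrates the planar form of (5.29) with a classical fourth-order Runge-Kutta step and monitors the integral (5.35). It assumes the units (5.36), places the primaries on the x-axis at $-m_2$ and $+m_1$, and takes $\boldsymbol{\omega}$ along the z-axis, so that $2\boldsymbol{\omega}\times\dot{\mathbf{x}}$ has components $(-2\dot{y}, 2\dot{x})$. The integrator, step size, and function names are illustrative choices, not taken from the text.

```python
import math


def masses(gamma):
    """Masses of the primaries from (5.36e)."""
    return 0.5 * (1.0 + gamma), 0.5 * (1.0 - gamma)


def potential(x, y, gamma):
    """Effective potential (5.65) in the primary plane."""
    m1, m2 = masses(gamma)
    return (-0.5 * (x * x + y * y)
            - m1 / math.hypot(x + m2, y) - m2 / math.hypot(x - m1, y))


def derivative(state, gamma):
    """Time derivative of (x, y, vx, vy) from (5.29) written in components."""
    x, y, vx, vy = state
    m1, m2 = masses(gamma)
    d1 = math.hypot(x + m2, y)
    d2 = math.hypot(x - m1, y)
    fx = x - m1 * (x + m2) / d1**3 - m2 * (x - m1) / d2**3
    fy = y - m1 * y / d1**3 - m2 * y / d2**3
    # The Coriolis term is moved to the right-hand side.
    return (vx, vy, fx + 2.0 * vy, fy - 2.0 * vx)


def rk4_step(state, dt, gamma):
    """One classical Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = derivative(state, gamma)
    k2 = derivative(shifted(state, k1, 0.5 * dt), gamma)
    k3 = derivative(shifted(state, k2, 0.5 * dt), gamma)
    k4 = derivative(shifted(state, k3, dt), gamma)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def jacobi_integral(state, gamma):
    """The constant of the motion (5.35)."""
    x, y, vx, vy = state
    return 0.5 * (vx * vx + vy * vy) + potential(x, y, gamma)


if __name__ == "__main__":
    gamma = (81.4 - 1.0) / (81.4 + 1.0)
    # Start at rest, slightly displaced from the triangular point.
    state = (0.5 * gamma + 0.01, math.sqrt(3.0) / 2.0, 0.0, 0.0)
    c0 = jacobi_integral(state, gamma)
    for _ in range(2000):
        state = rk4_step(state, 0.005, gamma)
    print(c0, jacobi_integral(state, gamma))
```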


Periodic Solutions 


The systematic search for periodic solutions of the circular restricted three 
body problem was inaugurated by Poincaré in a masterful series of mathematical 
studies. He conjectured that every bounded solution is arbitrarily 
close to some periodic solution, possibly with very long period. This reduces 
the classification problem to classification of periodic solutions. Periodic 
solutions are more easily classified because the behavior of a periodic function for 
all time is known, when its behavior for a finite time (the period) has been 
determined. 

Fig. 5.7. Earth-Moon “bus route” in the sidereal (non-rotating) reference system (Arenstorf 
(1963)). 

Another reason for studying periodic solutions is to use them as reference 
orbits for calculations which account for other physical effects by perturbation 
theory (Chapter 8). A prime example is the Hill-Brown lunar theory. With 
the Sun and Earth as primaries, Hill (1877) found a periodic solution within 
the Earth’s potential well (Figure 5.5) known as Hill’s variational curve. The 
curve is an oval, symmetrical about the primary axis and elongated perpen- 
dicular to the axis. With this as a reference curve, a description of the Moon's 
motion with high precision was developed by Hill and Brown. Since 1923, 
Brown’s results have been used in preparing tables of lunar motion. 

A solution $\mathbf{x} = \mathbf{x}(t; \gamma, C)$ is periodic with period $T$ if, for any time $t$, 

$$\mathbf{x}(t; \gamma, C) = \mathbf{x}(t + T; \gamma, C).$$

The solution depends parametrically on the mass and energy parameters $\gamma$ 
and $C$. When a particular periodic solution has been found, a whole family of 
periodic solutions can be generated from it by varying the parameters. Thus, 
Hill’s variational curve generates a family of periodic orbits about the smaller 
primary. The shape of the curve varies as the parameters are changed; the oval 
develops unforeseeable cusps and loops; they can be found only by numerical 
calculation. Poincaré called this family of orbits solutions of the first kind. 

Solutions of the second kind lie in the primary plane and loop around each 
primary. The existence of periodic orbits which pass arbitrarily close to each 
primary was first proved by Arenstorf (1963), and many such orbits have 
since been calculated. A particularly good candidate for a lunar bus route is 
shown in Figures 5.6 and 5.7. The buses would shuttle material and people 
between the Earth and Moon with minimal fuel consumption. 

Before the invention of the modern computer, the computation of three 
body orbits was long and laborious. An enormous number of periodic 
solutions, many with bizarre shapes, have been found in the last 25 years. 
Interest in the 3-body problem has never been greater for both practical and 
mathematical reasons. It remains an open and active field for research. For 
more information the reader should consult the specialized literature. The 
most comprehensive account of 3-body research is by Szebehely (1967). 


6-5. Exercises 


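(5.1) Use Equation (5.12) to show that the orbit $\mathbf{x}_1 = \mathbf{x}_1(t)$ of particle 1 
solves the equation 

$$\ddot{\mathbf{x}}_1 = -\frac{(m_2^2 + m_2m_3 + m_3^2)^{3/2}}{m^2}\,\frac{\mathbf{x}_1}{|\mathbf{x}_1|^3}.$$

(5.2) Solve Equation (5.16) and determine the collinear solutions 
explicitly when all three particles have identical masses. 

(5.3) Verify the Jacobi equations of motion (5.20) and (5.21) and the 
following expressions for the total angular momentum $\mathbf{l}$, kinetic 
energy $K$ and “central moment of inertia” $J = \sum_k m_k\mathbf{x}_k^2$: 

$$\begin{aligned}
\mathbf{l} &= g_1\mathbf{r}\times\dot{\mathbf{r}} + g_2\mathbf{x}\times\dot{\mathbf{x}},\\
2K &= g_1\dot{\mathbf{r}}^2 + g_2\dot{\mathbf{x}}^2,\\
J &= g_1\mathbf{r}^2 + g_2\mathbf{x}^2,
\end{aligned}$$

where $g_1 = m_1m_2/\mu$ and $g_2 = m_3\mu/m$. 

(5.4) Show that at the collinear Lagrange points $L_1$, $L_2$, $L_3$ the linearized 
equation for deviations in the primary plane has the form of 
equation (5.50), where $\alpha = 1 + \beta/3$ and 

$$\beta = \frac{3}{4}\left[\frac{1 + \gamma}{r_1^3} + \frac{1 - \gamma}{r_2^3}\right].$$

Prove therefrom that the collinear Lagrange points are unstable. 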


Chapter 7 


Rigid Body Mechanics 


Rigid Body Mechanics is a subtheory of classical mechanics, with its own 
body of concepts and theorems. It is mainly concerned with working out the 
consequences of rigidity assumptions in models of solid bodies. 

The formulation of rigid body theory in Section 7-1 is unique in its use of a 
spinor equation to describe rotational kinematics. This makes the whole 
spinor (quaternion) theory of rotations, with all its unique advantages, 
available for application to rigid body problems. Sections 7-3 and 7-4 present 
one of the most extensive mathematical treatments of spinning tops to be 
found anywhere, certainly the most extensive using spinor methods. Some of 
this material is likely to be difficult for the novice, but comparison with 
alternative approaches in the literature shows that it includes many simplifi- 
cations. Since this is the first extensive spinor treatment of classical rotational 
dynamics to be published, it can probably be improved, and it is wide open 
for new applications. In a subsequent book (NFII), we shall see that this 
approach is closely related to the quantum mechanical theory of spinning 
particles. 

The treatment of inertia tensors in Section 7-2 is intended to be complete 
and systematic enough to make it useful as a reference. 


7-1. Rigid Body Modeling 


This section is concerned with general principles and strategies for developing 
rigid body models of solid objects. We can distinguish three major stages in 
the development of a rigid body model. In the first stage, a suitable set of 
descriptive variables is determined to describe the structure and state of 
motion of the body as well as its interactions with other objects. In the second 
stage, the descriptive variables are combined with laws of motion and interac- 
tion to determine definite equations of motion for the body. In the final stage 
the equations of motion are solved and their consequences are analyzed. In 
this section we will be concerned with the first two stages only, but we will 
stop short of developing specific models. Specific rigid body models and their 
ramifications will be studied in subsequent sections. 

A complete set of descriptive variables and laws of motion for an arbitrary 
rigid body is listed in Table 1.1. The dynamical laws of rigid body motion 
were derived from the laws of particle mechanics in Section 6.1. Now, 
however, when we do rigid body mechanics we take the laws of rigid body 
motion as axioms, so no further appeal to particle mechanics is necessary, 
unless one wants to consider alternatives to the working assumption of 
rigidity. To apply the laws of motion, we need to understand and control the 
descriptive variables, so we approach that first. 


State Variables 


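We designate the position and attitude of a rigid body respectively by a vector 
$\mathbf{X}$ and a unitary spinor $R$. The vector $\mathbf{X}$ designates the center of mass of the 
body. The spinor $R$ relates the relative position $\mathbf{r}$ of each particle in the body 
to a relative position $\mathbf{r}'$ in a fixed reference configuration; specifically, the 
spinor-valued function $R = R(t)$ determines the time dependent rotation 

$$\mathbf{r} = R^\dagger\mathbf{r}'R, \tag{1.1}$$

so that 

$$\dot{\mathbf{r}} = \boldsymbol{\omega}\times\mathbf{r}, \tag{1.2}$$

where $\boldsymbol{\omega}$ is the rotational velocity of the body as specified in Table 1.1. The 
position $\mathbf{x}$ of a particle in the body is given by 

$$\mathbf{x} = \mathbf{X} + \mathbf{r} = \mathbf{X} + R^\dagger\mathbf{r}'R. \tag{1.3}$$

TABLE 1.1. Descriptive Variables and Laws of Motion for a Rigid Body 

Translational Motion                                  Rotational Motion 

Object and State Variables 
Mass: $m = \sum_i m_i$                                 Inertia tensor: $\mathcal{I}\mathbf{u} = \sum_i m_i\mathbf{r}_i\,\mathbf{r}_i\wedge\mathbf{u}$, where $\mathbf{r}_i = \mathbf{x}_i - \mathbf{X}$ 
Position (vector): $\mathbf{X} = \frac{1}{m}\sum_i m_i\mathbf{x}_i$       Attitude (spinor): $R$, where $R^\dagger R = 1$ 
Velocity: $\dot{\mathbf{X}} = d\mathbf{X}/dt$                            Rotational velocity: $\Omega = i\boldsymbol{\omega} = 2R^\dagger\dot{R}$ 
Momentum: $\mathbf{P} = m\dot{\mathbf{X}}$                               Angular momentum: $\mathbf{l} = \mathcal{I}\boldsymbol{\omega}$ (vector), $\mathbf{L} = i\mathbf{l}$ (bivector) 
Kinetic energy: $K_{\rm tr} = \tfrac{1}{2}\dot{\mathbf{X}}\cdot\mathbf{P} = \tfrac{1}{2}m\dot{\mathbf{X}}^2$    Kinetic energy: $K_{\rm rot} = \tfrac{1}{2}\boldsymbol{\omega}\cdot\mathbf{l} = \tfrac{1}{2}\boldsymbol{\omega}\cdot\mathcal{I}\boldsymbol{\omega}$ 

Interaction Variables 
Force: $\mathbf{F} = \sum_i\mathbf{F}_i$                                Torque: $\boldsymbol{\Gamma} = \sum_i\mathbf{r}_i\times\mathbf{F}_i$ 

Laws of Motion 
Newton’s law: $\dot{\mathbf{P}} = m\ddot{\mathbf{X}} = \mathbf{F}$                    Euler's law: $\dot{\mathbf{l}} = \mathcal{I}\dot{\boldsymbol{\omega}} + \boldsymbol{\omega}\times\mathcal{I}\boldsymbol{\omega} = \boldsymbol{\Gamma}$ 

It follows that the motion of a rigid body can be described completely by 
specifying its position $\mathbf{X} = \mathbf{X}(t)$ and its attitude $R = R(t)$ as functions of time, 
for the trajectory $\mathbf{x} = \mathbf{x}(t)$ of each particle in the body is then determined by 
(1.3). The translational and rotational trajectories $\mathbf{X} = \mathbf{X}(t)$ and $R = R(t)$ 
resulting from the action of specified forces are determined by the laws of 
motion in Table 1.1 together with specified initial values for $\dot{\mathbf{X}}$ and $\boldsymbol{\omega}$ as well as 
$\mathbf{X}$ and $R$. At any time $t$, the state of translational motion is described by the 
center of mass position $\mathbf{X}(t)$ and velocity $\dot{\mathbf{X}}(t)$, while the state of rotational 
motion is described by the attitude $R(t)$ and rotational velocity $\boldsymbol{\omega}(t)$. Accordingly, 
the descriptive variables $\mathbf{X}$, $\dot{\mathbf{X}}$, $R$, $\boldsymbol{\omega}$ are called state variables or 
kinematic variables for the rigid body. 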


Object Variables 


Object variables describe intrinsic physical properties taking on particular 
fixed values for each particular object. Values of the object variables for a 
composite object depend on its structure. The object variables for a rigid 
body are the mass $m$, inertia tensor $\mathcal{I}$ and the location of the center of mass $\mathbf{X}$ 
with respect to the body. Size and shape are also intrinsic properties of a rigid 
body, but they play no role in rigid body kinematics, so they need not be 
represented by object variables. However, the geometrical properties of size 
and shape play an important role in dynamics, since they determine the points 
at which contact forces can be applied. 

In modeling an object as a rigid body, the first problem is to determine the 
values of its object variables. Methods for solving this problem are discussed 
in Section 7-2. In our discussion here we take it for granted that the values of 
the object variables are known. We are interested here in general properties 
of the object variables. 

To interpret, analyze and solve Euler’s equation, we need to know the 
structure of the inertia tensor and its relation to the kinematic variables. We 
established in Section 6-1 that the general properties of the inertia tensor in 
Table 1.2 follow from its definition in Table 1.1. With these general proper- 
ties in hand, we need not refer to the detailed definition of the inertia tensor 
in our analysis of Euler’s equation. 

TABLE 1.2 Some General Properties of the Inertia Tensor
These properties hold for arbitrary vectors $\mathbf{u}$ and $\mathbf{v}$ and for an inertia tensor $\mathcal{I}$
relative to any base point.

Linear | $\mathcal{I}(\alpha\mathbf{u} + \beta\mathbf{v}) = \alpha\mathcal{I}\mathbf{u} + \beta\mathcal{I}\mathbf{v}$
Symmetric | $\mathbf{u} \cdot \mathcal{I}\mathbf{v} = \mathbf{v} \cdot \mathcal{I}\mathbf{u}$
Positive definite | $\mathbf{u} \cdot \mathcal{I}\mathbf{u} > 0$ if $\mathbf{u} \neq 0$
Kinematic | $\dot{\mathcal{I}}\mathbf{u} = \boldsymbol{\omega} \times \mathcal{I}\mathbf{u} + \mathcal{I}(\mathbf{u} \times \boldsymbol{\omega})$

Since the inertia tensor is linear and symmetric (Table 1.2), we know from
Section 5-2 that it has three orthonormal principal vectors $\mathbf{e}_k$ ($k = 1, 2, 3$)
satisfying the eigenvalue equation

$$\mathcal{I}\mathbf{e}_k = I_k \mathbf{e}_k \tag{1.4}$$

with principal values $I_k$ ($k = 1, 2, 3$). From the positive definite property
(Table 1.2), it follows that each principal value is a positive number. The
determination of principal vectors and values for specific bodies is carried out
in Section 7-2, but for the purpose of analyzing rotational motion, it is only
necessary to know that they exist.
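
The symmetry and positive definite properties in Table 1.2 are easy to check numerically for a small system of point masses. The following is only a minimal sketch in plain Python: the helper names (`cross`, `dot`, `inertia`) and the four-particle test body are our own illustrative assumptions, and the inertia tensor is applied through its definition in Table 1.1 written in the equivalent cross-product form $\mathcal{I}\mathbf{u} = \sum_i m_i \mathbf{r}_i \times (\mathbf{u} \times \mathbf{r}_i)$.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def inertia(masses, positions, u):
    """Apply the inertia tensor to u: sum_i m_i r_i x (u x r_i),
    with r_i = x_i - X measured from the center of mass X."""
    m = sum(masses)
    X = tuple(sum(m_i * x_i[k] for m_i, x_i in zip(masses, positions)) / m
              for k in range(3))
    out = [0.0, 0.0, 0.0]
    for m_i, x_i in zip(masses, positions):
        r = tuple(x_i[k] - X[k] for k in range(3))
        term = cross(r, cross(u, r))
        for k in range(3):
            out[k] += m_i * term[k]
    return tuple(out)


# Hypothetical test body: four point masses that are not collinear.
masses = [1.0, 2.0, 3.0, 4.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
u = (0.3, -1.2, 0.5)
v = (2.0, 0.4, -0.7)

# Symmetric: u . Iv - v . Iu should vanish up to rounding error.
symmetric_gap = dot(u, inertia(masses, positions, v)) - dot(v, inertia(masses, positions, u))
# Positive definite: u . Iu should be strictly positive for u != 0.
quadratic_form = dot(u, inertia(masses, positions, u))
```

The check does not replace the derivation asked for in the exercises; it only confirms the two properties for one arbitrary choice of $\mathbf{u}$ and $\mathbf{v}$.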

The principal vectors $\mathbf{e}_k$ specify directions in fixed relation to the rigid body.
It is convenient to imagine that the $\mathbf{e}_k$ are rigidly attached to the body at the
center of rotation (assumed to be the center of mass unless otherwise specified).
The lines through the center of rotation with directions $\mathbf{e}_k$ are called
principal axes of the body (Figure 1.1). Since the $\mathbf{e}_k$ rotate with the body,
according to (1.2) they obey the equation of motion

$$\dot{\mathbf{e}}_k = \boldsymbol{\omega} \times \mathbf{e}_k. \tag{1.5}$$

According to (1.1), the solution to these equations can be given the form

$$\mathbf{e}_k = R^\dagger \boldsymbol{\sigma}_k R, \tag{1.6}$$

where $\{\boldsymbol{\sigma}_k\}$ is any standard frame of constant vectors. Since $RR^\dagger = 1$, it
follows from (1.6) that

$$\mathbf{e}_1 \mathbf{e}_2 \mathbf{e}_3 = \boldsymbol{\sigma}_1 \boldsymbol{\sigma}_2 \boldsymbol{\sigma}_3 = i,$$

where the body frame $\{\mathbf{e}_k\}$ has been chosen to be righthanded.

Fig. 1.1. Principal axes for an arbitrary body. (The figure also marks the instantaneous rotation axis.)
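Euler’s equation $\mathcal{I}\dot{\boldsymbol{\omega}} + \boldsymbol{\omega} \times \mathcal{I}\boldsymbol{\omega} = \boldsymbol{\Gamma}$ can be decomposed into its components
with respect to the body frame, with the result

$$\begin{aligned}
I_1 \dot{\omega}_1 + (I_3 - I_2)\,\omega_2 \omega_3 &= \Gamma_1, \\
I_2 \dot{\omega}_2 + (I_1 - I_3)\,\omega_3 \omega_1 &= \Gamma_2, \\
I_3 \dot{\omega}_3 + (I_2 - I_1)\,\omega_1 \omega_2 &= \Gamma_3,
\end{aligned} \tag{1.7}$$

where $\omega_k = \boldsymbol{\omega} \cdot \mathbf{e}_k$ and $\Gamma_k = \boldsymbol{\Gamma} \cdot \mathbf{e}_k$. Most of the literature on rigid body motion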
deals with Euler’s equation only in its component form (1.7). In contrast, we 
will develop techniques to handle Euler’s equation without breaking it up into 
components. This makes it easier to interpret results and visualize the motion 
as a whole, and it has certain mathematical advantages. It should be noted, 
however, that (1.7) does not give the components of Euler’s equation with 
respect to an arbitrary coordinate system. Rather (1.7) gives the components 
with respect to a special frame determined by the intrinsic structure of the 
body. So we should expect (1.7) to have some special advantages. Indeed, in 
Section 7-4 we shall see that (1.7) is most useful for treating the rotational 
motion of an asymmetric body. 
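
To make the relation between the component form (1.7) and the vector form of Euler’s equation concrete, here is a small sketch in plain Python. The function name `omega_dot` and the numerical values are illustrative assumptions; all quantities are components with respect to the principal (body) frame, so the inertia tensor acts simply as multiplication by the principal values.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def omega_dot(I, omega, torque):
    """Solve the component equations (1.7) for the time derivatives
    of the body-frame components of the rotational velocity."""
    I1, I2, I3 = I
    w1, w2, w3 = omega
    g1, g2, g3 = torque
    return ((g1 - (I3 - I2) * w2 * w3) / I1,
            (g2 - (I1 - I3) * w3 * w1) / I2,
            (g3 - (I2 - I1) * w1 * w2) / I3)


# Hypothetical principal values, rotational velocity and torque.
I = (1.0, 2.0, 3.0)
omega = (0.3, -0.5, 0.7)
torque = (0.1, 0.2, -0.4)

wd = omega_dot(I, omega, torque)

# Vector form: I*omega_dot + omega x (I*omega) - torque should vanish.
I_wd = tuple(I[k] * wd[k] for k in range(3))
I_w = tuple(I[k] * omega[k] for k in range(3))
gyro = cross(omega, I_w)
residual = tuple(I_wd[k] + gyro[k] - torque[k] for k in range(3))
```

The residual vanishes (to rounding error) only because the components are taken along the principal vectors; in an arbitrary frame the inertia tensor would not be diagonal.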

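The vector $\boldsymbol{\omega} = 2R^\dagger \dot{R}/i$ is the rotational velocity of the body relative to the
nonrotating space frame. From the viewpoint of an observer on the spinning
body, the body is at rest while the universe rotates around it with a rotational
velocity $\boldsymbol{\omega}'$ which differs from $\boldsymbol{\omega}$ only in that it rotates with the body. More
specifically, the rotational velocity with respect to the body frame is given by

$$\boldsymbol{\omega}' = R \boldsymbol{\omega} R^\dagger = -2i \dot{R} R^\dagger. \tag{1.8}$$

Note that

$$\omega_k = \boldsymbol{\omega} \cdot \mathbf{e}_k = \langle \boldsymbol{\omega} R^\dagger \boldsymbol{\sigma}_k R \rangle_0 = \boldsymbol{\omega}' \cdot \boldsymbol{\sigma}_k. \tag{1.9}$$

Consequently, Euler’s equations (1.7) can be regarded as equations for either
$\boldsymbol{\omega}$ or $\boldsymbol{\omega}'$. We will work mostly with $\boldsymbol{\omega}$, but $\boldsymbol{\omega}'$ will be needed when we wish to
interpret observations with respect to a rotating frame such as the Earth.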


Change of Base Point 


The equations in Table 1.1 decompose the motion of a rigid body into 
translations and rotations which can be analyzed separately, although they 
may be coupled. This decomposition is achieved by choosing the center of 
mass X as a base point (or center of rotation) through which the rotation 
passes. Sometimes, however, the description of motion is simpler when 
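referred to a different base point $\mathbf{Y}$ in the body. Let us examine this possibility.
The chosen center of rotation $\mathbf{Y}$ can be designated by its directance

$$\mathbf{R} = \mathbf{X} - \mathbf{Y} \tag{1.10}$$

to the center of mass $\mathbf{X}$ (the vector $\mathbf{R}$ is written in bold to distinguish it from
the attitude spinor $R$). The angular momentum about this point is

$$\mathbf{l}_Y = \sum_i m_i (\mathbf{x}_i - \mathbf{Y}) \times (\dot{\mathbf{x}}_i - \dot{\mathbf{Y}}) = \sum_i m_i (\mathbf{r}_i + \mathbf{R}) \times (\dot{\mathbf{r}}_i + \dot{\mathbf{R}}). \tag{1.11}$$

But $\sum_i m_i \mathbf{r}_i = \sum_i m_i (\mathbf{x}_i - \mathbf{X}) = 0$, so by expanding the right side of (1.11) we
obtain

$$\mathbf{l}_Y = \mathbf{l} + m\mathbf{R} \times \dot{\mathbf{R}}. \tag{1.12}$$

Thus, $\mathbf{l}_Y$ is the sum of the body’s “intrinsic angular momentum” $\mathbf{l}$ with respect
to the center of mass and its “orbital angular momentum” $m\mathbf{R} \times \dot{\mathbf{R}}$ with
respect to the base point. Furthermore, from (1.2) it follows that $\dot{\mathbf{s}}_i = \boldsymbol{\omega} \times \mathbf{s}_i$ for
$\mathbf{s}_i = \mathbf{x}_i - \mathbf{Y}$, so (1.11) yields

$$\mathbf{l}_Y = \mathcal{I}_Y \boldsymbol{\omega} = \sum_i m_i \mathbf{s}_i \times (\boldsymbol{\omega} \times \mathbf{s}_i) = \sum_i m_i \mathbf{s}_i\, \mathbf{s}_i \wedge \boldsymbol{\omega}. \tag{1.13}$$

This defines the inertia tensor $\mathcal{I}_Y$, which determines the angular momentum $\mathbf{l}_Y$
as a function of the rotational velocity $\boldsymbol{\omega}$. Since $\dot{\mathbf{R}} = \boldsymbol{\omega} \times \mathbf{R}$, we can express
(1.12) as a relation among inertia tensors:

$$\mathcal{I}_Y \boldsymbol{\omega} = \mathcal{I}\boldsymbol{\omega} + m\mathbf{R}\, \mathbf{R} \wedge \boldsymbol{\omega}. \tag{1.14}$$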


This important formula is called the parallel axis theorem, because it relates 
the rotational motion about an axis through the center of mass to the 
rotational motion about a parallel axis passing through another base point of 
the body (Figure 1.2). 
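
The parallel axis theorem (1.14) can be verified numerically for a handful of point masses. The sketch below is illustrative only: the function `apply_inertia`, the three-particle body and the base point `Y` are assumptions made for the example, and the term $m\mathbf{R}\,\mathbf{R} \wedge \boldsymbol{\omega}$ is evaluated in its equivalent cross-product form $m\mathbf{R} \times (\boldsymbol{\omega} \times \mathbf{R})$.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def apply_inertia(masses, positions, base, u):
    """Inertia tensor about `base` applied to u:
    sum_i m_i s_i x (u x s_i) with s_i = x_i - base."""
    out = [0.0, 0.0, 0.0]
    for m_i, x_i in zip(masses, positions):
        s = tuple(x_i[k] - base[k] for k in range(3))
        term = cross(s, cross(u, s))
        for k in range(3):
            out[k] += m_i * term[k]
    return tuple(out)


# Hypothetical body and base point.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
m = sum(masses)
X = tuple(sum(m_i * x_i[k] for m_i, x_i in zip(masses, positions)) / m
          for k in range(3))
Y = (0.5, -1.0, 2.0)
R = tuple(X[k] - Y[k] for k in range(3))   # directance (1.10)
omega = (0.2, -0.4, 1.0)

lhs = apply_inertia(masses, positions, Y, omega)        # I_Y omega
intrinsic = apply_inertia(masses, positions, X, omega)  # I omega
orbital = cross(R, cross(omega, R))                     # R x (omega x R)
rhs = tuple(intrinsic[k] + m * orbital[k] for k in range(3))
max_gap = max(abs(lhs[k] - rhs[k]) for k in range(3))
```

The agreement depends on `X` being the center of mass; for any other reference point the cross terms in the expansion of (1.11) would not cancel.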

We can get an equation for the rotational motion about $\mathbf{Y}$ by substituting (1.12)
into the rotational equation of motion from Table 1.1; thus,

$$\dot{\mathbf{l}} = \dot{\mathbf{l}}_Y - m\mathbf{R} \times \ddot{\mathbf{R}} = \boldsymbol{\Gamma} = \boldsymbol{\Gamma}_Y - \mathbf{R} \times \mathbf{F}, \tag{1.15}$$

where

$$\boldsymbol{\Gamma}_Y = \sum_i (\mathbf{x}_i - \mathbf{Y}) \times \mathbf{F}_i = \sum_i (\mathbf{r}_i + \mathbf{R}) \times \mathbf{F}_i \tag{1.16}$$

is the torque about the new base point. Inserting

$$\dot{\mathbf{P}} = m\ddot{\mathbf{Y}} + m\ddot{\mathbf{R}} = \mathbf{F} \tag{1.17}$$

into (1.15), we get the rotational equation of motion in the form

$$\dot{\mathbf{l}}_Y = \boldsymbol{\Gamma}_Y - m\mathbf{R} \times \ddot{\mathbf{Y}}. \tag{1.18}$$

Fig. 1.2. Any point $\mathbf{Y}$ can be chosen as a center of rotation, and the rotational velocity has the same value $\boldsymbol{\omega}$ at all points of the body.

The coupled equations (1.17) and (1.18) are useful in problems where
$\mathbf{Y} = \mathbf{Y}(t)$ is a function specified by constraints, in other words, when the
solution of the translational equation of motion is known. Most important is
the case when $\mathbf{Y}$ is a fixed point, so $\dot{\mathbf{Y}} = 0$ and hence $\ddot{\mathbf{Y}} = 0$. Then (1.18)
reduces to

$$\dot{\mathbf{l}}_Y = \mathcal{I}_Y \dot{\boldsymbol{\omega}} + \boldsymbol{\omega} \times \mathcal{I}_Y \boldsymbol{\omega} = \boldsymbol{\Gamma}_Y. \tag{1.19}$$

This is identical in form to Euler’s equation (Table 1.1) for motion about the
mass center. The only difference is in the choice of inertia tensor according to
(1.13) or (1.14).

To sum up, a change of base point from $\mathbf{X}$ to $\mathbf{Y}$ can be regarded as a change
of translational state variables. Although this does not affect the rotational
state variables $R$ and $\boldsymbol{\omega}$, it does induce changes in the inertia tensor and the
equations of motion.


Constants of Motion 


In rigid body mechanics, constants of motion arise from symmetries of the 
equations of motion, just as in particle mechanics. As in particle mechanics, 
constants of the translational motion derive from special properties of the 
applied forces. Similarly, constants of rotational motion derive from special 
properties of the applied torque. We derived the internal energy conservation 
law for a system of particles in Section 6-1. But if we are to carry out our 
program of developing rigid body mechanics independently of particle mech- 
anics, we must see how energy conservation for rotational motion can be 
derived from Euler’s equation. 

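Using the kinematic property of the inertia tensor in Table 1.2, we find that
$\boldsymbol{\omega} \cdot \dot{\mathcal{I}}\boldsymbol{\omega} = 0$, and with the symmetry property we have

$$\boldsymbol{\omega} \cdot \dot{\mathbf{l}} = \boldsymbol{\omega} \cdot (\dot{\mathcal{I}}\boldsymbol{\omega} + \mathcal{I}\dot{\boldsymbol{\omega}}) = \boldsymbol{\omega} \cdot \mathcal{I}\dot{\boldsymbol{\omega}} = \frac{d}{dt}\left(\tfrac{1}{2} \boldsymbol{\omega} \cdot \mathcal{I}\boldsymbol{\omega}\right).$$

Therefore Euler’s equation gives

$$\frac{d}{dt}\left(\tfrac{1}{2} \boldsymbol{\omega} \cdot \mathbf{l}\right) = \boldsymbol{\omega} \cdot \boldsymbol{\Gamma} \tag{1.20}$$

for the rate of change of the rotational kinetic energy $\tfrac{1}{2} \boldsymbol{\omega} \cdot \mathbf{l} = \tfrac{1}{2} \boldsymbol{\omega} \cdot \mathcal{I}\boldsymbol{\omega}$. This
is the most we can say about rotational energy without some specific assumption
about the torque.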
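
As an illustration of (1.20), consider the simplest assumption about the torque, namely $\boldsymbol{\Gamma} = 0$, for which the rotational kinetic energy should be constant. The following sketch integrates the component equations (1.7) with a standard fourth-order Runge-Kutta step and compares the energy before and after. The integrator, step size, principal values and initial rotational velocity are all illustrative assumptions, not a method taken from the text.

```python
def rhs(I, w):
    """Torque-free Euler equations (1.7) solved for the derivatives
    of the body-frame components of the rotational velocity."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return (-(I3 - I2) * w2 * w3 / I1,
            -(I1 - I3) * w3 * w1 / I2,
            -(I2 - I1) * w1 * w2 / I3)


def rk4_step(I, w, dt):
    """Advance w by one classical Runge-Kutta step of size dt."""
    k1 = rhs(I, w)
    k2 = rhs(I, tuple(w[i] + 0.5 * dt * k1[i] for i in range(3)))
    k3 = rhs(I, tuple(w[i] + 0.5 * dt * k2[i] for i in range(3)))
    k4 = rhs(I, tuple(w[i] + dt * k3[i] for i in range(3)))
    return tuple(w[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
                 for i in range(3))


def kinetic_energy(I, w):
    """Rotational kinetic energy (1/2) omega . I omega in the principal frame."""
    return 0.5 * sum(I[i] * w[i] ** 2 for i in range(3))


# Hypothetical asymmetric body and initial rotational velocity.
I = (1.0, 2.0, 3.0)
w = (1.0, 0.2, -0.5)

K0 = kinetic_energy(I, w)
for _ in range(2000):          # integrate over 2 time units
    w = rk4_step(I, w, 0.001)
K1 = kinetic_energy(I, w)
```

The components of $\boldsymbol{\omega}$ change appreciably during the run, while `K1` should agree with `K0` up to the small truncation error of the integrator.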

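In the important case of a single force $\mathbf{F}$ applied at a point $\mathbf{r}$ fixed in the
rotating body, the torque is $\boldsymbol{\Gamma} = \mathbf{r} \times \mathbf{F}$, and

$$\boldsymbol{\omega} \cdot \boldsymbol{\Gamma} = (\boldsymbol{\omega} \times \mathbf{r}) \cdot \mathbf{F} = \dot{\mathbf{r}} \cdot \mathbf{F}. \tag{1.21}$$

For a conservative force with potential $V = V(\mathbf{r})$, we know from Section 2-8
that

$$\dot{\mathbf{r}} \cdot \mathbf{F} = -\dot{\mathbf{r}} \cdot \nabla V = -\frac{dV}{dt}. \tag{1.22}$$

Then, from (1.20) we get the rotational energy

$$E_{\mathrm{rot}} = \tfrac{1}{2} \boldsymbol{\omega} \cdot \mathbf{l} + V(\mathbf{r}) \tag{1.23}$$

as a constant of motion. For the important case of constant force, we use
(1.21) directly to get

$$E_{\mathrm{rot}} = \tfrac{1}{2} \boldsymbol{\omega} \cdot \mathcal{I}\boldsymbol{\omega} - \mathbf{r} \cdot \mathbf{F}. \tag{1.24}$$

Of course, it is possible for translational and rotational energies to be
conserved together even when they are not conserved separately.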

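Other constants of motion derive from the fact that the torque is a product
of vectors. Thus, even for an arbitrary force we have

$$\mathbf{r} \cdot \boldsymbol{\Gamma} = \mathbf{r} \cdot (\mathbf{r} \times \mathbf{F}) = 0.$$

If the force acts always on the same point in the body, then $\dot{\mathbf{r}} = \boldsymbol{\omega} \times \mathbf{r}$, and
$\dot{\mathbf{r}} \cdot \mathbf{l} = (\boldsymbol{\omega} \times \mathbf{r}) \cdot \mathbf{l} = \mathbf{r} \cdot (\mathbf{l} \times \boldsymbol{\omega})$. Obviously $\dot{\mathbf{r}} \cdot \mathbf{l}$ vanishes if $\mathbf{l} = I\boldsymbol{\omega}$; we shall see
that it also vanishes for an axially symmetric body if $\mathbf{r}$ lies along the axis of
symmetry. Therefore, in these cases $\mathbf{r} \cdot \dot{\mathbf{l}} = d(\mathbf{r} \cdot \mathbf{l})/dt = 0$, so $\mathbf{r} \cdot \mathbf{l}$ is a constant of
the motion.

Similarly,

$$\mathbf{F} \cdot \boldsymbol{\Gamma} = \mathbf{F} \cdot (\mathbf{r} \times \mathbf{F}) = 0.$$

Therefore, if $\mathbf{F}$ is constant, then $\mathbf{F} \cdot \mathbf{l}$ is a constant of the motion. Thus, for the
case of a constant force acting at a fixed point on the body, the quantities $E_{\mathrm{rot}}$,
$\mathbf{r} \cdot \mathbf{l}$ and $\mathbf{F} \cdot \mathbf{l}$ are all constants of the rotational motion. By finding these three
constants we have, in effect, integrated Euler’s equation to determine the
rotational velocity $\boldsymbol{\omega}$. This reduces the problem of determining the rotational
motion to integrating $\dot{R} = \tfrac{1}{2} R i \boldsymbol{\omega}$ for the attitude $R$. That is still a tricky
problem, as we shall see in Sections 7-3 and 7-4.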


A Compact Formulation of the Rigid Body Laws 


The center of mass momentum $\mathbf{P}$ and “internal” angular momentum $\mathbf{l}$ of a
rigid body can be combined into a single quantity $P$, defined by

$$P = \mathbf{P} + i\mathbf{l}. \tag{1.25}$$

Let's call this the complex momentum of the body. Similarly, a force $\mathbf{F}$ and
torque $\boldsymbol{\Gamma}$ applied to a rigid body can be combined in a single quantity

$$W = \mathbf{F} + i\boldsymbol{\Gamma} \tag{1.26}$$

called a complex force or wrench on the body. With these definitions for $P$ and
$W$, the laws for rigid body translational and rotational motion in Table 1.1 can
be combined in a single equation

$$\dot{P} = W, \tag{1.27}$$

which we might call the complex law of motion for a rigid body.

Now the question is whether the motion law (1.27) has any physical
meaning or value beyond the fact that it is the most compact formulation of
the rigid body motion laws which we could hope for. The first thing to note is
that the complex combination of momentum and angular momentum in
(1.25) is geometrically correct since, as we have noted before, the angular
momentum is most properly represented geometrically as a bivector $\mathbf{L} = i\mathbf{l}$.
Thus, $P$ is more than a mere formal combination of “real vectors” $\mathbf{P}$ and $\mathbf{l}$ with
a “unit imaginary” $i$; it is a combination of physically distinct vector and
bivector quantities. Similar remarks apply to the complex force, since the
torque is most properly represented geometrically as a bivector $i\boldsymbol{\Gamma}$.

The “complex vector” $P$ represents the state of rigid body motion in the
6-dimensional space of vectors plus bivectors. The dimension of this space is
exactly right, because a rigid body has six degrees of freedom. Moreover, the
partition of this space into the 3-dimensional subspaces of vectors and
bivectors corresponds exactly to the partition of a rigid motion into translational
and rotational motions. Thus, the description of rigid motion in this
space by the law (1.27) makes sense physically. We shall see below that the
“complex formulation” has a deeper physical meaning and some mathematical
advantages. Since this compact formulation of rigid body theory with
geometric algebra has not been published previously, the full extent of its 
usefulness remains to be determined. In the meantime, we always have the 
option of treating translational and rotational parts separately as usual. 

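Of course, our use of a complex momentum calls for a complex velocity $V$
defined by

$$V = \dot{\mathbf{X}} + i\boldsymbol{\omega}. \tag{1.28}$$

In working with complex vectors, it is convenient to introduce a scalar
product defined by

$$P * V = \langle P^\dagger V \rangle_0. \tag{1.29}$$

Using (1.25) and (1.28), then, we find that the total kinetic energy $K$ of a rigid
body is given by the simple expression

$$K = \tfrac{1}{2} P * V = \tfrac{1}{2} (\mathbf{P} \cdot \dot{\mathbf{X}} + \mathbf{l} \cdot \boldsymbol{\omega}). \tag{1.30}$$

And from the motion law (1.27), we find that the change in kinetic energy is
determined by the equation

$$\dot{K} = W * V = \mathbf{F} \cdot \dot{\mathbf{X}} + \boldsymbol{\Gamma} \cdot \boldsymbol{\omega}. \tag{1.31}$$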


One would expect this to be most useful in problems where rotational and 
translational motions are coupled, as in rolling motion. 


Equipollence and Reduction of Force Systems 


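A single force $\mathbf{F}$ applied to a rigid body at a point $\mathbf{r}$ in the body frame exerts a
torque $\boldsymbol{\Gamma} = \mathbf{r} \times \mathbf{F}$. Therefore, the full effect of the force on the body is
represented by the “complex interaction variable”

$$W = \mathbf{F} + i\mathbf{r} \times \mathbf{F} = \mathbf{F} + \mathbf{r} \wedge \mathbf{F}. \tag{1.32a}$$

In Section 2-6 we saw that any vector $\mathbf{F}$ and its moment $\mathbf{r} \wedge \mathbf{F}$ determine a
unique oriented line with directance $\mathbf{d} = (\mathbf{r} \wedge \mathbf{F})\mathbf{F}^{-1}$ from the origin (= center
of mass here). For this reason, a wrench of the form (1.32a) is sometimes
called a line vector, and it can be written in the equivalent form

$$W = \mathbf{F} + \mathbf{d}\mathbf{F} = (1 + \mathbf{d})\mathbf{F}. \tag{1.32b}$$

The oriented line is called the axode and $\mathbf{d}$ is called the moment arm of $\mathbf{F}$ or $W$
(Figure 1.3).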

The equivalence of (1.32b) with 
(1.32a) means that F would have 
the same effect on the body if it 
were applied at the point d instead 
of at r or, indeed, at any other point 
on the axode. Forces applied at 
different points of a rigid body are 
said to be equipollent if and only if 
their line vectors (wrenches) are 
equal. This implies that two forces 
are equipollent if and only if they 
have the same axode and magni- 
tude. 

The concept of equipollence is readily generalized to systems of forces.
Suppose that forces $\mathbf{F}_1, \mathbf{F}_2, \ldots, \mathbf{F}_n$ are applied simultaneously to a rigid body
at points $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n$ respectively. Each applied force $\mathbf{F}_i$ determines a
wrench on the body

$$W_i = \mathbf{F}_i + i\mathbf{r}_i \times \mathbf{F}_i = \mathbf{F}_i + \mathbf{d}_i \mathbf{F}_i, \tag{1.33}$$

where $\mathbf{d}_i$ is the moment arm to the $i$th axode. The net effect on the body, of
the entire system of forces, is determined by the superposition of wrenches
producing the resultant wrench

$$W = \sum_i W_i = \mathbf{F} + i\boldsymbol{\Gamma}, \tag{1.34}$$

where $\mathbf{F} = \sum_i \mathbf{F}_i$ is the resultant force and $\boldsymbol{\Gamma} = \sum_i \mathbf{r}_i \times \mathbf{F}_i$ is the resultant torque.
This is the superposition principle for rigid body mechanics. According to the
motion law (1.27), the entire effect of a system of forces on a rigid body is
determined by its resultant wrench. Therefore, two different force systems will
have the same effect on rigid body motion if and only if their resultant wrenches
are equal. In that case, we say that the force systems are equipollent.

Fig. 1.3. The axode of a force $\mathbf{F}$.
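
The equipollence test just described amounts to comparing resultant forces and resultant torques. A minimal sketch in plain Python follows; the function `resultant_wrench` and the sample force are hypothetical, and the wrench is stored simply as the pair of vectors $(\mathbf{F}, \boldsymbol{\Gamma})$ rather than as a multivector. The example confirms that sliding a force along its axode leaves the wrench unchanged.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def resultant_wrench(forces, points):
    """Return (F, Gamma): the resultant force and the resultant torque
    about the origin for forces applied at the given points."""
    F = [0.0, 0.0, 0.0]
    G = [0.0, 0.0, 0.0]
    for f, r in zip(forces, points):
        t = cross(r, f)
        for k in range(3):
            F[k] += f[k]
            G[k] += t[k]
    return tuple(F), tuple(G)


# Hypothetical force applied at a point of the body.
force = (0.0, 3.0, 4.0)
point = (1.0, 0.0, 0.0)

# The same force applied at another point of its axode (point + 2.5 * force).
slid = tuple(point[k] + 2.5 * force[k] for k in range(3))

wrench_a = resultant_wrench([force], [point])
wrench_b = resultant_wrench([force], [slid])
wrench_gap = max(abs(wrench_a[j][k] - wrench_b[j][k])
                 for j in range(2) for k in range(3))
```

Two force systems with several forces each can be compared in the same way by passing the full lists of forces and points.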

We can use the fact that equipollent force systems have identical effects on 
rigid body motion to simplify our models by replacing a given system of forces 
by a simpler system equipollent to it. This is called reduction of forces (or 
wrenches). To facilitate force reduction, we now develop a few general 
theorems. 

First we note that equipollence relations are independent of base point. To
prove it, we recall that a change in base point from the center of mass induces
a change in the resultant torque $\boldsymbol{\Gamma}$ of a system of forces (given by (1.15)) while
the resultant force $\mathbf{F}$ is unchanged. Since (1.15) gives $\boldsymbol{\Gamma}_Y = \boldsymbol{\Gamma} + \mathbf{R} \times \mathbf{F}$, a shift
of base point changes a wrench $W = \mathbf{F} + i\boldsymbol{\Gamma}$ to

$$W_Y = W + \mathbf{R} \wedge \mathbf{F}. \tag{1.35}$$

Since $\mathbf{F}$ is the same for equipollent force systems, their wrenches will be
changed by the same amount and so remain equipollent.

The base point independence of equipollence relations is especially impor- 
tant in rigid body statics, the study of force systems that maintain mechanical 
equilibrium. A body is said to be in mechanical equilibrium if the resultant 
applied wrench vanishes, that is, if the system of applied forces is equipollent 
to zero. In a typical statics problem, one of the forces and/or its point of 
application is unknown and must be computed from the other known forces. 
If $W_1 = \mathbf{F}_1 + \mathbf{d}_1 \mathbf{F}_1$ is the wrench of an unknown force $\mathbf{F}_1$ and $W_2$ is the resultant
wrench of the known forces, then the equation for mechanical equilibrium
can be written

$$W_1 + W_2 = 0, \tag{1.36}$$

which is trivially solved for $\mathbf{F}_1$ and its moment arm $\mathbf{d}_1$. Evidently every statics
problem is essentially a problem in wrench reduction. The main trick in such 
problems is to choose the base point to simplify the reduction. For example, 
the torque reduction is often simplest if the origin is chosen at the intersection 
of concurrent axodes. Of course, the best choice of origin depends on the 
given information. 


Parallel Forces 


For a system of parallel forces $\mathbf{F}_i$, we can write $\mathbf{F}_i = F_i \mathbf{u}$, where $\mathbf{u}$ is a unit
vector and $F_i = \mathbf{u} \cdot \mathbf{F}_i$. The resultant wrench of the system is then

$$W = \sum_i F_i \mathbf{u} + \Big(\sum_i F_i \mathbf{d}_i\Big) \mathbf{u}. \tag{1.37}$$

If the resultant force

$$\mathbf{F} = \sum_i \mathbf{F}_i = \Big(\sum_i F_i\Big) \mathbf{u}$$

does not vanish, then we can write (1.37) in the form

$$W = \mathbf{F} + \mathbf{d}\mathbf{F},$$

where

$$\mathbf{d} = \frac{\sum_i F_i \mathbf{d}_i}{\sum_i F_i}. \tag{1.38}$$

Thus, we have proved that any system of parallel forces with nonvanishing
resultant $\mathbf{F}$ is equipollent to a single force $\mathbf{F}$ with moment arm $\mathbf{d}$ given by
(1.38).

For a uniform gravitational field acting on the body, we have $F_i = m_i g$
where $m_i$ is the mass of the $i$th particle, so (1.38) becomes

$$\mathbf{d} = \frac{\sum_i m_i \mathbf{d}_i}{\sum_i m_i}. \tag{1.39}$$

This is an expression for the directance of the center of mass from a line with
direction $\mathbf{u}$ passing through the origin (= base point). Of course, $\mathbf{d}$ vanishes if
the origin is the center of mass as we have been assuming. But our argument
shows that the result (1.39) must hold for any chosen base point.

For the case of two parallel forces, (1.38) is simply

$$\mathbf{d} = \frac{F_1 \mathbf{d}_1 + F_2 \mathbf{d}_2}{F_1 + F_2}. \tag{1.40}$$

This should be recognized as the expression for a “point of division” discussed
in Section 2-6. Many other geometrical results in that section are
useful in the analysis of force systems. Note that (1.40) immediately gives us
the elementary “law of the lever”; it tells us that the effects of parallel forces
applied at $\mathbf{d}_1$ and $\mathbf{d}_2$ can be exactly cancelled by a single force applied to the
point of division $\mathbf{d}$.
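
A short numerical example may help fix the meaning of (1.38) and (1.40). In the sketch below (plain Python; the function name `point_of_division` and the sample numbers are our own assumptions) two parallel forces of magnitudes 2 and 6 act along $\mathbf{u}$ with moment arms on opposite sides of the origin, and we check that the single resultant force applied at the point of division produces the same torque as the pair.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def point_of_division(magnitudes, arms):
    """Moment arm (1.38) of the single force equipollent to a system
    of parallel forces with signed magnitudes F_i and moment arms d_i."""
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("resultant force vanishes; the system reduces to a couple")
    return tuple(sum(F * d[k] for F, d in zip(magnitudes, arms)) / total
                 for k in range(3))


# Hypothetical lever: forces along u applied on either side of the origin.
u = (0.0, 0.0, 1.0)
magnitudes = [2.0, 6.0]
arms = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]

d = point_of_division(magnitudes, arms)

# Torque of the original system: sum_i d_i x (F_i u).
torque_sum = [0.0, 0.0, 0.0]
for F, arm in zip(magnitudes, arms):
    t = cross(arm, tuple(F * c for c in u))
    for k in range(3):
        torque_sum[k] += t[k]

# Torque of the single resultant force applied at d.
total = sum(magnitudes)
torque_single = cross(d, tuple(total * c for c in u))
```

Here the point of division lies at $-0.5$ along the first axis, closer to the larger force, as the law of the lever requires.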


Couples 


A pair of equal and opposite forces applied to a rigid body is called an applied
couple. The wrench $C$ for a couple of forces $\mathbf{F}_1$ and $\mathbf{F}_2 = -\mathbf{F}_1$ applied at points $\mathbf{r}_1$ and
$\mathbf{r}_2$ is (Figure 1.4)

$$C = \mathbf{F}_1 + \mathbf{r}_1 \wedge \mathbf{F}_1 + \mathbf{F}_2 + \mathbf{r}_2 \wedge \mathbf{F}_2 = \mathbf{r}_1 \wedge \mathbf{F}_1 - \mathbf{r}_2 \wedge \mathbf{F}_1.$$

Thus, the wrench of a couple can be written

$$C = (\mathbf{r}_1 - \mathbf{r}_2) \wedge \mathbf{F}_1 = \mathbf{d}\mathbf{F}_1, \tag{1.41}$$

where $\mathbf{d}$ is the directance between the axodes of the couple. Clearly, the
resultant force of a couple is always zero and its resultant torque is zero if
$\mathbf{d} = 0$.

Fig. 1.4. A couple and its wrench $C$.

Since $\mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2 = 0$ for a couple, it follows from (1.35) that
the torque exerted by a couple is independent of the base point. As
(1.41) shows, the torque of a couple depends on the force $\mathbf{F}_1$ and the
directance $\mathbf{d}$. Therefore, equipollent couples can be obtained by
changing the magnitudes of $\mathbf{F}_1$ and $\mathbf{d}$ in inverse proportion or by displacing the
point at which $\mathbf{F}_1$ is applied to any desired point as long as $\mathbf{d}$ is kept fixed.

Any system of applied forces with a vanishing resultant force can be reduced
to a couple, that is, the system is equipollent to a couple. For, in general, such
a system has a nonvanishing torque $\boldsymbol{\Gamma}$, so the wrench of the system has the
form $W = i\boldsymbol{\Gamma}$. To find an equipollent couple, we pick any force $\mathbf{F}_1$ orthogonal
to $\boldsymbol{\Gamma}$ and write

$$i\boldsymbol{\Gamma} = \mathbf{d}\mathbf{F}_1.$$

This determines the directance of the couple:

$$\mathbf{d} = i\boldsymbol{\Gamma}\mathbf{F}_1^{-1} = \frac{i\boldsymbol{\Gamma}\mathbf{F}_1}{F_1^2} = \frac{\mathbf{F}_1 \times \boldsymbol{\Gamma}}{F_1^2}.$$

As a corollary to this result, we can conclude that any system of couples is
equipollent to a single couple.


Reduction to a Force and Couple 


We are now in position to ascertain the most general reduction theorem. The
wrench of any system of forces has the form $W = \mathbf{F} + i\boldsymbol{\Gamma}$. A single force $\mathbf{F}$
acting at the base point produces no torque, and we have proved that a couple
producing a given torque $\boldsymbol{\Gamma}$ can always be found. Therefore, any system of
forces can be reduced to a couple and a single force acting at the base point.
However, this reduction is not an equipollence relation, because it depends
on the choice of base point.

To find the simplest equipollence reduction, we decompose $\boldsymbol{\Gamma}$ into components
$\boldsymbol{\Gamma}_\perp$ and $\boldsymbol{\Gamma}_\parallel$ respectively orthogonal and parallel to $\mathbf{F}$. Define a moment
arm $\mathbf{d}$ for the force $\mathbf{F}$ by

$$\mathbf{d} = \frac{\mathbf{F} \times \boldsymbol{\Gamma}_\perp}{F^2}, \tag{1.42}$$

so that

$$\mathbf{F} + i\boldsymbol{\Gamma}_\perp = \mathbf{F} + \mathbf{d}\mathbf{F}. \tag{1.43}$$

This is a line vector describing the action of a single applied force. Now
choose a couple with torque $\boldsymbol{\Gamma}_\parallel$ and note that we can write

$$i\boldsymbol{\Gamma}_\parallel = ih\mathbf{F}, \tag{1.44}$$

where

$$h = \frac{\boldsymbol{\Gamma} \cdot \mathbf{F}}{F^2}. \tag{1.45}$$

When the torque of a couple is collinear with a force $\mathbf{F}$ as expressed by (1.44)
and (1.45), we say that the couple is parallel to the force with pitch $h$.
Now we add (1.43) and (1.44) to get

$$W = \mathbf{F} + i\boldsymbol{\Gamma} = \mathbf{F} + (\mathbf{d} + ih)\mathbf{F} = \mathbf{F} + i(\mathbf{d} \times \mathbf{F} + h\mathbf{F}). \tag{1.46}$$

The right side of this equation is the wrench for a single force $\mathbf{F}$ applied
together with a parallel couple. A shift of base point will change the torque
$\mathbf{d} \times \mathbf{F}$, but it will not change the body point at which the force $\mathbf{F}$ is applied or
the torque of the couple. Since the left side of (1.46) may be regarded as the
resultant wrench of an arbitrary system of forces, we have proved that every
system of forces is equipollent to a single force and parallel couple. This
includes the limiting cases $h = 0$ and $h = \infty$. When $h = 0$ the wrench
reduces to a line vector, so the force system is equipollent to a single force
without a couple. We may assume that the case $h = \infty$ describes a pure couple,
obtained by taking the limit $h \to \infty$ as $|\mathbf{F}| \to 0$ in such a way that the
product $h|\mathbf{F}|$ in (1.44) remains finite.

Fig. 1.5. A “wrench” generates a screw motion along its axode.
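
The reduction (1.46) is straightforward to carry out numerically. The following sketch (plain Python; the function `reduce_to_wrench` and the sample values are illustrative assumptions) computes the moment arm and pitch from a resultant force and torque, rebuilds the torque as $\mathbf{d} \times \mathbf{F} + h\mathbf{F}$, and checks that the pitch is unaffected by a shift of base point, for which the torque changes by $\mathbf{R} \times \mathbf{F}$.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def reduce_to_wrench(F, G):
    """Return (d, h): the moment arm d = F x Gamma / F^2 and the
    pitch h = Gamma . F / F^2 of a force system with resultant
    force F and resultant torque G."""
    F2 = dot(F, F)
    if F2 == 0:
        raise ValueError("vanishing resultant force: the system is a pure couple")
    h = dot(G, F) / F2
    d = tuple(c / F2 for c in cross(F, G))
    return d, h


# Hypothetical resultant force and torque.
F = (0.0, 0.0, 2.0)
G = (1.0, -3.0, 4.0)
d, h = reduce_to_wrench(F, G)

# Rebuild the torque as d x F + h F.
dxF = cross(d, F)
rebuilt = tuple(dxF[k] + h * F[k] for k in range(3))

# Shift of base point by a hypothetical directance R.
R = (0.7, -0.2, 1.1)
RxF = cross(R, F)
G_shifted = tuple(G[k] + RxF[k] for k in range(3))
d_shifted, h_shifted = reduce_to_wrench(F, G_shifted)
```

For these values the pitch is $h = 2$ and the moment arm is $\mathbf{d} = (1.5, 0.5, 0)$; only the moment arm changes when the base point is shifted.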

A force system consisting of a single applied force and a parallel couple is
called a “wrench” in older literature on mechanics (Figure 1.5), whereas we
have used the term wrench (without quotes) for the complex force of any
force system. Our definition is more practical, since we then have frequent
occasion to use the term, whereas an actual applied “wrench” is very rare
physically. It is most convenient to use the same term wrench for either sense,
since the two senses can be distinguished from the context, and they are
intimately related. Then our major conclusion in the preceding paragraph can
be expressed more succinctly as: any system of forces can be reduced to a
wrench. For similar reasons, it is convenient to use the term couple for any
applied torque which can be produced by a couple of forces, even if it is
actually produced by a larger number of forces.

To appreciate the aptness of the term “wrench”, consider a wrench applied
to a rigid body at rest. The force $\mathbf{F}$ will produce an acceleration along its
axode, whereas the couple $ih\mathbf{F}$ will generate a rotation about the axode. The
instantaneous composite motion is therefore an instantaneous screw displacement.
It is analogous to the motion of a physical screw turned by a physical
wrench.


Concurrent Forces 


One other kind of force system is of general interest. A system of forces is
said to be concurrent if the axodes of all the forces pass through a common
point. Let $\mathbf{r}$ be the position vector of this point with respect to the body center
of mass. Then the wrench of the system is

$$W = \sum_k \mathbf{F}_k + \sum_k \mathbf{r} \wedge \mathbf{F}_k.$$

Therefore,

$$W = \mathbf{F} + \mathbf{r} \wedge \mathbf{F} = \mathbf{F} + \mathbf{d}\mathbf{F}, \tag{1.47}$$

where $\mathbf{d}$ is the moment arm of the resultant force $\mathbf{F}$. Thus, we have proved
that a system of concurrent forces is equipollent to a single force with axode
passing through the intersection of their axodes. The case of parallel forces
with $\mathbf{F} \neq 0$ can be regarded as the limit of this case as $|\mathbf{r}| \to \infty$.

The example of two concurrent contact forces is illustrated in Figure 1.6. A
more significant example is the case of a central force field acting on all the
particles of the body, in particular, a gravitational field. According to (1.47),
the effect of the entire field is equivalent to the effect of a single force $\mathbf{F}$
applied at the point $\mathbf{d}$. In the gravitational case, this point is called the center
of gravity. In general, the center of gravity differs from the center of mass,
except in the limiting case of a uniform gravitational field. This means that a
body in a nonuniform gravitational field experiences a torque about its center
of mass. For example, the nonuniform gravitational fields of the Sun and the
Moon exert a torque on the Earth. We shall investigate that in Chapter 8.

Fig. 1.6. Two concurrent forces are equipollent to a single force $\mathbf{F}$ with moment arm $\mathbf{d}$.


7-1. Exercises 


(1.1) Derive the general properties of the inertia tensor in Table 1.2
directly from its definition in Table 1.1.
(1.2) Derive the component form (1.7) for Euler’s equation.
(1.3) Prove that the scalar product defined by (1.29) is symmetric and
positive definite.
(1.4) Show that $W * V$ is invariant under a shift of base point, that is,

$$\mathbf{F} \cdot \dot{\mathbf{X}} + \boldsymbol{\Gamma} \cdot \boldsymbol{\omega} = \mathbf{F} \cdot \dot{\mathbf{Y}} + \boldsymbol{\Gamma}_Y \cdot \boldsymbol{\omega}.$$

Show that $P * V$ is not invariant in the same sense. What is the
meaning of this?

7-2. Rigid Body Structure


The kinematics of rigid body motion depends on only a few intrinsic properties
of the body, namely, its mass $m$, center of mass $\mathbf{X}$ and inertia tensor $\mathcal{I}$. In
this section we develop methods for determining $\mathbf{X}$ and $\mathcal{I}$ from the distribution
of mass in a given body.

So far we have regarded rigid bodies as composed of a finite number of
point particles. However, our results can easily be generalized to describe
continuous bodies by standard techniques of integral calculus. Thus, a continuous
body can be subdivided into $N$ parts, where the $k$-th part contains a point
$\mathbf{x}_k$, has volume $\Delta V_k$ and mass $\Delta m_k$. The mass of the body is then

$$m = \sum_{k=1}^{N} \Delta m_k. \tag{2.1}$$

In the limit of infinite subdivision as $N \to \infty$ and $\Delta V_k \to 0$, the sum becomes
an integral:

$$m = \int dm = \int \varrho\, dV, \tag{2.2}$$

where $dV$ is the element of volume and $\varrho = \varrho(\mathbf{x})$ is the mass density at each
point $\mathbf{x}$ of the body. Let it be understood that the integral in (2.2) is to be
taken over all points of the body and further that the integral reduces to a sum
for any part of the body composed of point particles.


Determining the center of mass 


The distribution of mass in a body is described by the mass density $\varrho = \varrho(\mathbf{x})$.
It determines the center of mass $\mathbf{X}$ defined by

$$\mathbf{X} = \frac{1}{m} \int dm\, \mathbf{x} = \frac{1}{m} \int dV\, \varrho\, \mathbf{x}. \tag{2.3}$$

This is the generalization of our earlier definition to continuous bodies.

The center of mass for a given body can always be calculated by performing
the integral in (2.3). However, the calculation can often be simplified by using
one of the general theorems which we now proceed to establish. In the first
place, we are not much interested in the center of mass $\mathbf{X}$ relative to an
arbitrary origin, because this can be given any value whatever merely by a
shift of origin. Rather we are interested in the center of mass relative to some
easily identifiable point $\mathbf{Y}$ of the body. Accordingly, let us write

$$\mathbf{r} = \mathbf{x} - \mathbf{Y}$$

for the position of a particle relative to $\mathbf{Y}$, so

$$m\mathbf{X} = \int dm\, \mathbf{x} = \int dm\, \mathbf{r} + \mathbf{Y} \int dm.$$

Thus, the center of mass relative to our specially chosen origin $\mathbf{Y}$ is given by

$$\mathbf{R} = \mathbf{X} - \mathbf{Y} = \frac{1}{m} \int dm\, \mathbf{r} = \frac{1}{m} \int dV\, \varrho\, \mathbf{r}, \tag{2.4}$$

where now we write $\varrho = \varrho(\mathbf{r})$. Equation (2.4) is mathematically identical to
(2.3), but it differs in the physical assumption that the origin need not be a
fixed point in an inertial frame. Thus, for the purpose of calculating the center
of mass, we are free to choose any convenient point in the body as origin
without considering motion of the body. Once the center of mass has been
identified, its motion can be determined from the equation for translational
motion.


Symmetry Principles for the Center of Mass 


The center of mass can often be identified from symmetries of the body. 
There are three major types of symmetry: reflection, rotation and inversion. 
A body with “reflection symmetry” or ‘‘mirror symmetry” is symmetrical 
with respect to reflection in a plane. We can describe this with the mathemati- 
cal formulation of reflections developed in Section 5-3. Let n be a unit normal 
to the symmetry plane and select some point in the plane as origin. The 
symmetry of the body is then described by the condition that the mass density 
is invariant under the reflection

$$\mathbf{r} \to \mathbf{r}' = -\mathbf{n}\mathbf{r}\mathbf{n}, \tag{2.5a}$$

that is,

$$\varrho(\mathbf{r}) = \varrho(-\mathbf{n}\mathbf{r}\mathbf{n}) = \varrho(\mathbf{r}'). \tag{2.5b}$$

This is illustrated in Figure 2.1. Now, from (2.4) and (2.5b) we obtain

$$-\mathbf{n}\mathbf{R}\mathbf{n} = \frac{1}{m} \int dm(\mathbf{r})\, (-\mathbf{n}\mathbf{r}\mathbf{n}) = \frac{1}{m} \int dm(\mathbf{r}')\, \mathbf{r}' = \mathbf{R}.$$

Therefore,

$$\mathbf{n}\mathbf{R} + \mathbf{R}\mathbf{n} = 2\mathbf{n} \cdot \mathbf{R} = 0,$$

telling us that $\mathbf{R}$ lies in the symmetry plane. Thus we have proved
the theorem: If a body has a plane of symmetry, then the center of mass is
located in that plane. This theorem has some obvious corollaries: If a body has
two distinct symmetry planes, then the center of mass is located on their line of
intersection. If a body has three symmetry planes which intersect at a single
point, then that point is the center of mass.

Fig. 2.1. A body with a single symmetry plane.

Rotational symmetry tells us more about the center of mass than a single
reflection symmetry. A body is said to have a symmetry axis if it is symmetrical
with respect to a nontrivial rotation about that axis. The adjective
“nontrivial” here is meant to exclude the “trivial symmetries” under rotations
by integer multiples of $2\pi$ which every body possesses. To describe a rotational
symmetry mathematically, let us choose an origin on the axis of
symmetry so a rotation $\mathcal{S}$ about this axis can be written

$$\mathbf{r} \to \mathbf{r}' = \mathcal{S}\mathbf{r} = S^\dagger \mathbf{r} S, \tag{2.6}$$

where $S$ is a unitary spinor. This rotation is a symmetry of the body if it leaves
the mass density invariant:

$$\varrho(\mathbf{r}) = \varrho(\mathcal{S}\mathbf{r}) = \varrho(\mathbf{r}'). \tag{2.7}$$

Applying this to the center of mass vector (2.4), we obtain

$$\mathcal{S}\mathbf{R} = \frac{1}{m} \int dm(\mathbf{r})\, \mathcal{S}\mathbf{r} = \frac{1}{m} \int dm(\mathbf{r}')\, \mathbf{r}' = \mathbf{R},$$

which tells us that the center of mass is invariant under $\mathcal{S}$. It follows (from
5-3.20) that the center of mass must lie on the axis of symmetry. Thus, we
have proved the theorem: If a body has an axis of symmetry, then its center of
mass lies on that axis. As an obvious corollary we have: If a body has two
distinct symmetry axes, then the axes intersect at a point which is the center of
mass.

A homogeneous plane lamina in the shape of a parallelogram, as shown in 
Figure 2.2, is invariant under a rotation by 7 about an axis along n through its 
center. It is obviously invariant under reflections in its lane as well. There- 
fore, its center of mass is located at the point 


R= a +b) 


if a corner is chosen as origin. This choice has the advantage of relating 
R to the vectors a and b in Figure 2.2 which characterize the shape of 
the body. If $a\cdot b = 0$, the parallelogram reduces to a rectangle and the 
body has additional symmetries; specifically, it is symmetric with respect to 
reflections in the two planes passing through R with normals a and b. However, 
the location of the center of mass is unaffected by this additional symmetry. 

Fig. 2.2. The symmetry axis of a parallelogram passes through its center. 

A body is said to have inversion symmetry if the origin can be chosen so that 


$$\varrho(r) = \varrho(-r), \tag{2.8}$$

that is, the mass density is invariant under the inversion $r \to -r$. For a body 
with this kind of symmetry 


$$mR = \int dm(r)\,r = -\int dm(-r)\,(-r) = -mR,$$


so R = 0. Thus, we have proved that if a body has inversion symmetry, then 
the center of inversion is the center of mass. 

A homogeneous parallelepiped with intersecting edges which are not 
perpendicular provides an example of a body with inversion symmetry with- 
out any rotation or reflection symmetries. 

It is a well-established convention that the phrase ‘‘a symmetry of the 
body” refers to invariance of the body under an orthogonal transformation of 
the three kinds (reflections, rotations, inversions) already discussed or any 
combination of them. But there is another common kind of symmetry which 
we have already mentioned without giving a proper definition, namely, the 
symmetry of a continuous body when all of its component particles are 
physically identical. A body with this kind of symmetry is said to be homogeneous. 
The mass density $\varrho$ of a homogeneous body has the same value at each 
point of the body. It follows that 


$$m = \varrho\int dV = \varrho V, \tag{2.9}$$


that is, the mass m of the body is directly proportional to its volume V. Then 
from (2.4) it follows that 


$$R = \frac{1}{V}\int dV\, r, \tag{2.10}$$


so the location of the center of mass R is determined by the geometry of the 
body alone. In this case, the center of mass is called the centroid. 


The Additivity Principle for the Center of Mass 


Besides the symmetry principles just mentioned, the most useful general 
principle for determining the center of mass is additivity: If a body is composed 
of N bodies with known masses $m_k$ and mass centers $R_k$, then the mass 
center R of the composite body can be obtained by treating the parts as 
particles, that is, 


$$mR = m_1R_1 + m_2R_2 + \dots + m_NR_N, \tag{2.11}$$


where, of course, $m = m_1 + m_2 + \dots + m_N$. This is an elementary consequence 
of the additivity of the integrals (2.4) and (2.7). 


Example 2.1. Centroid of a Triangular Lamina 


The additivity principle applies also to a continuous subdivision of a body. 
For example, let us calculate the centroid of the triangular lamina shown in 
Figure 2.3. We subdivide the triangle into narrow strips of width $d\lambda$ parallel to 
one side. By symmetry, the centroid of a strip is at its center 


$$R_\lambda = \tfrac{1}{2}\lambda(a + b) = \lambda a_+, \tag{2.12}$$


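where $\lambda$ is a scalar parameter in the range $0 \le \lambda \le 1$. The directed area of the 
strip is 

$$d\mathbf{A}_\lambda = \tfrac{1}{2}[(\lambda + d\lambda)a]\wedge[(\lambda + d\lambda)b] - \tfrac{1}{2}(\lambda a)\wedge(\lambda b) = 2\lambda\,d\lambda\,\mathbf{A}, \tag{2.13}$$

where $\mathbf{A} = \tfrac{1}{2}a\wedge b$ is the directed area of the triangle. The mass of the strip is 

$$dm_\lambda = \frac{m}{A}\,dA_\lambda = 2m\lambda\,d\lambda, \tag{2.14}$$

where 

$$dA_\lambda = |d\mathbf{A}_\lambda| \quad\text{and}\quad A = |\mathbf{A}|.$$

So the centroid of the entire triangle is 

$$R = \frac{1}{m}\int dm_\lambda\,R_\lambda = 2a_+\int_0^1 \lambda^2\,d\lambda = \tfrac{2}{3}\,a_+. \tag{2.15}$$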


The vector $a_+$ specifies a median of the triangle (the line segment from a vertex 
to the midpoint of the opposite side), so (2.15) says that the centroid is located 
on a median two thirds of the distance from a vertex. This must be true for all 
medians, so we can conclude that the three medians of a triangle intersect at a 
common point, the centroid of the triangle. This result is so simple that one 
suspects it can be obtained without integration. Indeed it can! Regard the 
triangle in Figure 2.3 as half a parallelogram with edges a and b. We know by 
symmetry that the centroid of the parallelogram is $a_+$. By additivity, $a_+$ must lie 
on a line connecting the centroids of the two identical triangles. Therefore the 
centroid of each triangle must lie on the median contained in the diagonal of the 
parallelogram. Since the centroid lies on one median, it must lie at the intersection 
of all three medians, as we found before by calculation. Unfortunately, our symmetry 
argument does not give the factor 2/3 in (2.15). 

Fig. 2.3. Subdivision of a plane triangular lamina. 
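
The strip subdivision of Example 2.1 can be checked numerically. The following is only a sketch under stated assumptions: the function name `triangle_centroid_by_strips`, the sample edge vectors and the strip count `n` are illustrative choices, not part of the text. It sums strip masses $2\lambda\,d\lambda$ (unit total mass) times strip centroids $\lambda a_+$ and compares the result with $\tfrac{2}{3}a_+ = \tfrac{1}{3}(a+b)$.

```python
def triangle_centroid_by_strips(a, b, n=10000):
    """Centroid of a homogeneous triangle with edge vectors a, b from one vertex.

    The triangle is cut into n strips parallel to the side opposite the origin;
    strip i has parameter lam, mass 2*lam*dlam (total mass 1) and centroid lam*a_plus.
    """
    a_plus = tuple(0.5 * (ai + bi) for ai, bi in zip(a, b))
    dlam = 1.0 / n
    total_mass = 0.0
    moment = [0.0] * len(a)
    for i in range(n):
        lam = (i + 0.5) * dlam      # midpoint of the i-th strip
        dm = 2.0 * lam * dlam       # strip mass, as in (2.14) with m = 1
        total_mass += dm
        for j, c in enumerate(a_plus):
            moment[j] += dm * lam * c   # strip centroid lam * a_plus, as in (2.12)
    return tuple(mj / total_mass for mj in moment)


a, b = (4.0, 0.0), (1.0, 3.0)            # hypothetical edge vectors
R = triangle_centroid_by_strips(a, b)
expected = tuple((ai + bi) / 3.0 for ai, bi in zip(a, b))
```

The returned `R` should agree with `expected` to within the discretization error of the midpoint rule.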


Example 2.2. Centroid of a Hemisphere 


As another application of the additivity principle, let us calculate the 
centroid of a hemisphere. As indicated in Figure 2.4, we can subdivide the 
hemisphere into thin disks with centers on the axis of symmetry. A disk of 
thickness dz and centroid $z\hat{a}$ has mass $\varrho\pi(a^2 - z^2)\,dz$, where $a$ is the radius 
of the hemisphere and $\hat{a}$ is the unit vector along the axis. So the mass of the 
hemisphere is 

$$m = \varrho\pi\int_0^a (a^2 - z^2)\,dz = \tfrac{2}{3}\pi a^3\varrho,$$

and the centroid of the hemisphere is 

$$R = \frac{1}{m}\int_0^a \varrho\pi(a^2 - z^2)\,dz\; z\hat{a} = \tfrac{3}{8}\,a\hat{a}. \tag{2.16}$$

Fig. 2.4. Subdivision of a hemisphere into disks. 


The centroid of any solid of revolution can be found in a similar way. Such a 
body can always be subdivided into disks so its centroid is given by 

$$R = \frac{\int r^2(z)\,dz\; z\hat{a}}{\int r^2(z)\,dz}, \tag{2.17}$$

where $r^2(z)$ is the square of the radius of the disk with centroid $z\hat{a}$ on the axis, measured from an 
origin located at the intersection of the axis of symmetry with the base of the 
body. 
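
As a rough illustration of (2.17), the ratio of the two integrals can be evaluated with a midpoint sum. This is a sketch only; the helper name `centroid_height`, the profile functions and the sample dimensions are assumptions made for the example. It returns the height of the centroid above the base.

```python
def centroid_height(radius_sq, height, n=20000):
    """Height of the centroid of a homogeneous solid of revolution above its base.

    radius_sq(z) must return r^2(z), the squared radius of the disk at height z,
    for 0 <= z <= height. The density and the factor pi cancel in the ratio (2.17).
    """
    dz = height / n
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        z = (i + 0.5) * dz
        weight = radius_sq(z) * dz
        numerator += weight * z
        denominator += weight
    return numerator / denominator


a = 2.0   # hemisphere radius (illustrative value)
h = 5.0   # cone height (illustrative value)
z_hemisphere = centroid_height(lambda z: a * a - z * z, a)
z_cone = centroid_height(lambda z: (a * (1.0 - z / h)) ** 2, h)
```

For the hemisphere this should reproduce $3a/8$ from (2.16); for a cone resting on its base the same routine gives the familiar $h/4$.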


Calculating the Inertia Tensor 

The inertia tensor for a system of particles 

$$\mathcal{I}u = \sum_k m_k\, r_k\, r_k\wedge u$$

generalizes to 

$$\mathcal{I}u = \int dm\; r\,r\wedge u \tag{2.18}$$

for a continuous body. Although (2.18) holds for any chosen origin, it will be 
most convenient to assume that the origin for (2.18) is the center of mass 
unless an alternative is explicitly specified. 


Example 2.3. Inertia tensor of a uniform rod. 


The inertia tensor of a homogeneous rod can be calculated directly from 
(2.18) without difficulty. By symmetry the centroid of the rod is at its center, 
so we take that as the origin. Let the rod have directed length 2a and 
negligible thickness so we can regard it as a continuous line of mass points 
with direction specified by a vector a. Then $r = \lambda a$ designates the points of 
the rod when the values of $\lambda$ are in the range $-1 \le \lambda \le 1$. In this case the 
volume integral (2.18) for the inertia tensor reduces to a line integral. Thus, 

$$\mathcal{I}u = \int dm\; r\,r\wedge u = \int_{-1}^{1}\left(\frac{m}{2}\,d\lambda\right)(\lambda a)(\lambda a)\wedge u = \frac{m}{2}\,a\,a\wedge u\int_{-1}^{1}\lambda^2\,d\lambda.$$


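Therefore 

$$\mathcal{I}u = \frac{m}{3}\,a\,a\wedge u = \frac{m}{3}\,a\times(u\times a) \tag{2.19}$$

is the inertia tensor for a homogeneous rod of length $2|a|$. From the parallel 
axis theorem (1.8) we find that the inertia tensor $\mathcal{I}'$ with respect to either end 
of the rod is 

$$\mathcal{I}'u = \frac{m}{3}\,a\,a\wedge u + m\,a\,a\wedge u = \frac{4m}{3}\,a\,a\wedge u. \tag{2.20}$$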


Note that (2.19) is an explicit representation of the inertia tensor for a rod 
in terms of a vector a which directly represents the length and alignment of 
the rod. Similarly, for more complicated bodies we aim to evaluate the 
integral (2.18) to represent the inertia tensor for a body explicitly in terms of 
vectors describing prominent geometrical features of the body. But before 
attempting to integrate (2.18) for a complex body, it will be worthwhile to see 
how the problem can be simplified by exploiting the principles of symmetry 
and additivity which we used to simplify center of mass calculations. 
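
The rod results (2.19) and (2.20) can be checked by replacing the line integral with a sum over many small mass elements. The sketch below assumes vectors are plain 3-tuples and uses the identity $r\,r\wedge u = r^2u - (r\cdot u)\,r$; the helper names `rr_wedge` and `rod_inertia` and the sample values are illustrative, not from the text.

```python
def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def rr_wedge(r, u):
    """The vector r (r ^ u) = r^2 u - (r . u) r."""
    r2, ru = dot(r, r), dot(r, u)
    return tuple(r2 * ui - ru * ri for ui, ri in zip(u, r))


def rod_inertia(a, u, m=1.0, shift=0.0, n=4000):
    """Apply the inertia tensor of the rod r = lam*a, -1 <= lam <= 1, to u.

    shift moves the origin to the point lam = shift on the rod
    (shift = 0 is the center, shift = 1 or -1 is an end).
    """
    dlam = 2.0 / n
    out = [0.0, 0.0, 0.0]
    for i in range(n):
        lam = -1.0 + (i + 0.5) * dlam - shift
        dm = 0.5 * m * dlam
        r = tuple(lam * ai for ai in a)
        for j, c in enumerate(rr_wedge(r, u)):
            out[j] += dm * c
    return tuple(out)


m = 1.5
a = (1.0, 2.0, 2.0)       # half the directed length of the rod
u = (0.3, -1.0, 0.5)
about_center = rod_inertia(a, u, m)
about_end = rod_inertia(a, u, m, shift=1.0)
formula_center = tuple(m / 3.0 * c for c in rr_wedge(a, u))
formula_end = tuple(4.0 * m / 3.0 * c for c in rr_wedge(a, u))
```

Both numerical results should match the closed forms to within the discretization error.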


Symmetry Principles for Inertia Tensors. 


Let us first investigate what symmetries of a body tell about its inertia tensor. 
We have seen that a symmetry $\mathcal{S}$ of a body is best defined as an orthogonal 
transformation which takes each point r to a point $r' = \mathcal{S}r$ and leaves the 
mass density $\varrho$ invariant, that is, 

$$\varrho(r') = \varrho(\mathcal{S}r) = \varrho(r). \tag{2.21a}$$


In Section 5-3 we proved that any orthogonal transformation can be written in 
the explicit form 

$$r' = \mathcal{S}r = \pm S^\dagger r S, \tag{2.21b}$$

where $S^\dagger S = 1$ and the plus (minus) sign is used if S is an even (odd) 
multivector. In particular, (2.21b) is identical to the reflection (2.5a) if S = n 
or to the inversion $r' = -r$ if S = i. 

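To determine the relation of $\mathcal{S}$ to the inertia tensor, we can use (2.21b) to 
obtain 

$$r'\,r'\wedge u = \tfrac{1}{2}\,S^\dagger rS\,(S^\dagger rS\,u - u\,S^\dagger rS) = \tfrac{1}{2}\,S^\dagger r\,(r\,SuS^\dagger - SuS^\dagger\,r)\,S = S^\dagger\big(r\,r\wedge(SuS^\dagger)\big)S.$$

So, because of (2.21a), 

$$\mathcal{I}u = \int dm(r)\; r\,r\wedge u = \int dm(r')\; r'\,r'\wedge u = S^\dagger\Big[\int dm(r)\; r\,r\wedge(SuS^\dagger)\Big]S = S^\dagger\big(\mathcal{I}(SuS^\dagger)\big)S = \mathcal{S}\mathcal{I}\mathcal{S}^{-1}u,$$

where $\mathcal{S}^{-1}u = \pm SuS^\dagger$. Thus we have the operator formula 

$$\mathcal{S}\mathcal{I}\mathcal{S}^{-1} = \mathcal{I},$$

or, equivalently, 

$$\mathcal{S}\mathcal{I} = \mathcal{I}\mathcal{S}. \tag{2.22}$$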


This may be regarded as the precise mathematical formulation of the state- 
ment that ‘‘the inertia tensor of a body is invariant under every symmetry of the 
body’. It is not true, however, that every orthogonal transformation which 
commutes with the inertia tensor is a symmetry of the body. 

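Equation (2.22) relates symmetries to principal vectors and principal values. 
If a is a principal vector of $\mathcal{I}$ with principal value A, then we write 

$$\mathcal{I}a = Aa. \tag{2.23}$$

From (2.22) it follows that 

$$\mathcal{I}\mathcal{S}a = \mathcal{S}\mathcal{I}a = \mathcal{S}(Aa) = A\,\mathcal{S}a,$$

that is, 

$$\mathcal{I}a' = Aa' \quad\text{if}\quad a' = \mathcal{S}a.$$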


Thus, any symmetry-related vector $a' = \mathcal{S}a$ of a principal vector a is also a 
principal vector with the same principal value. A number of corollaries follow 
easily for the various kinds of symmetry. 

(1) The normal of any symmetry plane is a principal vector. 
(2) The axis of a nontrivial rotation symmetry is a principal axis. 
(3) If there is a nontrivial rotation symmetry through an angle less than $\pi$, 
then all vectors orthogonal to the rotation axis are principal vectors with the 
same principal value. 

It should be noted that inversion symmetry tells us nothing about the inertia 
tensor, though it does tell us the location of the center of mass. 
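
Corollaries (2) and (3) can be illustrated with a small particle system. The sketch below is an assumption-laden example, not the text's method: it uses the matrix form $I_{ij} = \sum m\,(r^2\delta_{ij} - r_ir_j)$ in place of the operator form, three equal masses arranged with a threefold axis along z, and hypothetical helper names. It checks that the inertia matrix commutes with the rotation by $2\pi/3$ and that the two transverse moments coincide.

```python
import math


def inertia_matrix(particles):
    """Matrix I_ij = sum m (r^2 delta_ij - r_i r_j) for (mass, position) pairs."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in particles:
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r2 if i == j else 0.0) - r[i] * r[j])
    return I


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Three unit masses related by rotations of 2*pi/3 about the z axis.
particles = [(1.0, (math.cos(0.3 + k * 2.0 * math.pi / 3.0),
                    math.sin(0.3 + k * 2.0 * math.pi / 3.0),
                    0.5)) for k in range(3)]
I = inertia_matrix(particles)
S = rot_z(2.0 * math.pi / 3.0)
SI, IS = matmul(S, I), matmul(I, S)
max_commutator = max(abs(SI[i][j] - IS[i][j]) for i in range(3) for j in range(3))
```

Here `max_commutator` vanishes up to rounding, `I[0][0]` equals `I[1][1]`, and the off-diagonal elements are zero, so the z axis is a principal axis and every vector in the xy-plane is a principal vector.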


The Additivity Principle for Inertia Tensors 


For direct calculation of an inertia tensor, additivity is the most important 
general principle. To formulate and derive this principle, it is convenient to 
designate a given body by the set $\mathcal{B}$ of mass points which make it up. Now if a 
body $\mathcal{B}$ is subdivided into N bodies $\mathcal{B}_1, \mathcal{B}_2, \dots, \mathcal{B}_N$, it follows from the 
definition (2.18) that its inertia tensor is subject to the corresponding subdivision; 

$$\int_{\mathcal{B}} dm\; r\,r\wedge u = \int_{\mathcal{B}_1} dm\; r\,r\wedge u + \int_{\mathcal{B}_2} dm\; r\,r\wedge u + \dots + \int_{\mathcal{B}_N} dm\; r\,r\wedge u,$$

where the integrals are over the indicated bodies (sets of particles). This 
relation can be expressed in operator form 

$$\mathcal{I}' = \mathcal{I}'_1 + \mathcal{I}'_2 + \dots + \mathcal{I}'_N, \tag{2.24}$$

describing the inertia tensor $\mathcal{I}'$ of a body as a sum of inertia tensors $\mathcal{I}'_k$ of its 
parts. The primes serve as a reminder that the inertia tensors in (2.24) are 
generally not referred to the mass center of the component bodies. 

It is essential to realize that the additivity relation (2.24) holds only for 
inertia tensors referred to a common origin, so the parallel axis theorem (1.8) 
is needed to exploit it. Thus, if a body $\mathcal{B}$ is composed of N parts with known 
masses $m_k$, mass centers $R_k$ and inertia tensors $\mathcal{I}_k$, then its inertia tensor can 
be found by the following steps. First, the inertia tensors of the parts must be 
referred to the origin by the parallel axis theorem (1.8), so we write 

$$\mathcal{I}'_k u = \mathcal{I}_k u + m_k R_k R_k\wedge u.$$

Then the inertia tensor $\mathcal{I}'$ of the body $\mathcal{B}$ is given by additivity; 

$$\mathcal{I}'u = \sum_k \mathcal{I}'_k u = \sum_k \mathcal{I}_k u + \sum_k m_k R_k R_k\wedge u, \tag{2.25}$$

where, of course, 

$$R = \frac{1}{m}\sum_{k=1}^{N} m_k R_k$$

is the center of mass of the body. The inertia tensor $\mathcal{I}'$ is referred to the 
origin, so the inertia tensor $\mathcal{I}$ with respect to the center of mass is determined 
by the parallel axis theorem 

$$\mathcal{I}'u = \mathcal{I}u + mRR\wedge u. \tag{2.26}$$

It will be noted that the parallel axis theorem (2.26) can be interpreted as an 
additivity relation, for the last term in (2.26) is the inertia tensor with respect 
to the origin of a single particle located at the center of mass R. 

The need to use (2.26) can be avoided by selecting the center of mass of the 
body $\mathcal{B}$ as origin before making the calculation, so (2.25) gives $\mathcal{I}$ directly; 
thus, 

$$\mathcal{I}u = \sum_{k=1}^{N}\left(\mathcal{I}_k u + m_k R_k R_k\wedge u\right), \tag{2.27}$$

where $R_k$ is the directance from the CM of the entire body to the CM of its 
k-th part. 

For a continuous subdivision of a body into parts the sum (2.27) goes over 
to an integral 

$$\mathcal{I}u = \int_\alpha^\beta dm_\lambda\left(\mathcal{I}_\lambda u + R_\lambda R_\lambda\wedge u\right). \tag{2.28}$$

For each value of the parameter $\lambda$ in the range $\alpha \le \lambda \le \beta$, $\mathcal{I}_\lambda$ is the inertia 
tensor per unit mass of the body part with mass $dm_\lambda$ and center of mass $R_\lambda$. 
Equation (2.28) is a powerful means for calculating inertia tensors. It enables 
us to calculate the tensors for 2-dimensional bodies from the tensors for 
1-dimensional bodies, and then calculate the tensors for 3-dimensional bodies 
from those of 2-dimensional bodies. This is best understood by working out 
some examples. 
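
Before the continuous examples, the discrete rule (2.27) can be tried on a composite of point masses. This is a sketch with hypothetical data and helper names: each part's tensor is computed about its own mass center, shifted to the overall mass center with the parallel axis term $m_kR_kR_k\wedge u$, and the sum is compared with a direct evaluation over all particles.

```python
def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def add(p, q):
    return tuple(pi + qi for pi, qi in zip(p, q))


def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def scale(s, p):
    return tuple(s * pi for pi in p)


def rr_wedge(r, u):
    """The vector r (r ^ u) = r^2 u - (r . u) r."""
    return sub(scale(dot(r, r), u), scale(dot(r, u), r))


def mass_and_center(particles):
    m = sum(mk for mk, _ in particles)
    R = (0.0, 0.0, 0.0)
    for mk, rk in particles:
        R = add(R, scale(mk / m, rk))
    return m, R


def tensor_apply(particles, origin, u):
    """Inertia tensor of the particles, referred to `origin`, applied to u."""
    out = (0.0, 0.0, 0.0)
    for mk, rk in particles:
        out = add(out, scale(mk, rr_wedge(sub(rk, origin), u)))
    return out


part_a = [(1.0, (0.0, 0.0, 0.0)), (2.0, (1.0, 0.5, 0.0)), (1.5, (0.0, 2.0, 1.0))]
part_b = [(0.5, (3.0, 1.0, -1.0)), (2.5, (2.0, -1.0, 0.5))]
whole = part_a + part_b
u = (0.4, -0.2, 1.0)

m, R = mass_and_center(whole)
direct = tensor_apply(whole, R, u)

composed = (0.0, 0.0, 0.0)
for part in (part_a, part_b):
    m_k, R_k = mass_and_center(part)
    own = tensor_apply(part, R_k, u)                    # tensor about the part's own CM
    shift = scale(m_k, rr_wedge(sub(R_k, R), u))        # parallel axis term of (2.27)
    composed = add(composed, add(own, shift))
```

The vectors `direct` and `composed` agree to rounding error, which is the content of (2.27).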


Example 2.4. Homogeneous triangular lamina 


Let us use the additivity principle to calculate the inertia tensor of a homogeneous 
triangular lamina. We can subdivide the triangle into narrow strips 
indexed by a parameter $\lambda$ as indicated in Figure 2.3. We have already 
determined in (2.14) that the mass of a strip is $dm_\lambda = 2m\lambda\,d\lambda$, and we know 
from (2.15) that the CM of the triangle is $\tfrac{2}{3}a_+$, so the relative CM of the strip 
is $R_\lambda = (\lambda - \tfrac{2}{3})a_+$. From the expression (2.19) for the inertia tensor for a rod, 
we can write down the inertia tensor per unit mass for the strip 

$$\mathcal{I}_\lambda u = \tfrac{1}{3}(\lambda a_-)(\lambda a_-)\wedge u,$$

where $\lambda a_-$, with $a_- = \tfrac{1}{2}(a - b)$, is half the directed length of the strip. 
Therefore (2.28) gives us 

$$\mathcal{I}u = \int_0^1 2m\lambda\,d\lambda\left(\frac{\lambda^2}{3}\,a_-a_-\wedge u + \left(\lambda - \tfrac{2}{3}\right)^2 a_+a_+\wedge u\right).$$

Carrying out the elementary integrations, we obtain 

$$\mathcal{I}u = \frac{m}{6}\left(a_-a_-\wedge u + \tfrac{1}{3}\,a_+a_+\wedge u\right) \tag{2.29}$$

for the inertia tensor of a homogeneous triangular lamina. 

The inertia tensor (2.29) can be expressed in forms that look more symmetrical 
as shown in the exercises, but (2.29) is preferable for some purposes. 
For example, for an isosceles triangle we have $a_+\cdot a_- = 0$, and (2.29) shows 
immediately that $a_+$ and $a_-$ are the principal vectors of the triangle with 
principal values given by 

$$\mathcal{I}a_+ = \frac{m a_-^2}{6}\,a_+, \qquad \mathcal{I}a_- = \frac{m a_+^2}{18}\,a_-. \tag{2.30}$$

Of course we would recognize that $a_+$ and $a_-$ are principal vectors from 
symmetry principles, and that is sufficient reason to express the inertia tensor 
in terms of them. 
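
Formula (2.29) can be compared with a brute-force sum over the lamina. In this sketch the triangle is cut into $n^2$ congruent small triangles, each replaced by a point mass at its centroid; this discretization, the helper names and the sample vectors are assumptions for illustration, and the approximation error falls off like $1/n^2$. It takes $a_\pm = \tfrac{1}{2}(a \pm b)$ with the origin of (2.29) at the centroid $\tfrac{1}{3}(a+b)$.

```python
def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def rr_wedge(r, u):
    """The vector r (r ^ u) = r^2 u - (r . u) r."""
    r2, ru = dot(r, r), dot(r, u)
    return tuple(r2 * ui - ru * ri for ui, ri in zip(u, r))


def lamina_tensor_numeric(a, b, u, m=1.0, n=120):
    """Sum over n*n small triangles, each treated as a particle at its centroid."""
    centroid = tuple((ai + bi) / 3.0 for ai, bi in zip(a, b))
    dm = m / (n * n)
    out = [0.0, 0.0, 0.0]

    def accumulate(s, t):
        r = tuple(s * ai + t * bi - ci for ai, bi, ci in zip(a, b, centroid))
        for k, c in enumerate(rr_wedge(r, u)):
            out[k] += dm * c

    for i in range(n):
        for j in range(n - i):            # small triangles pointing the same way as the lamina
            accumulate((i + 1.0 / 3.0) / n, (j + 1.0 / 3.0) / n)
        for j in range(n - i - 1):        # inverted small triangles
            accumulate((i + 2.0 / 3.0) / n, (j + 2.0 / 3.0) / n)
    return tuple(out)


def lamina_tensor_formula(a, b, u, m=1.0):
    """Right-hand side of (2.29)."""
    a_plus = tuple(0.5 * (ai + bi) for ai, bi in zip(a, b))
    a_minus = tuple(0.5 * (ai - bi) for ai, bi in zip(a, b))
    p, q = rr_wedge(a_minus, u), rr_wedge(a_plus, u)
    return tuple(m / 6.0 * (pi + qi / 3.0) for pi, qi in zip(p, q))


a, b = (3.0, 0.0, 0.0), (1.0, 2.0, 0.0)
u = (0.2, -0.4, 1.0)
numeric = lamina_tensor_numeric(a, b, u)
formula = lamina_tensor_formula(a, b, u)
```

The two results should agree to roughly one part in $n^2$.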


Example 2.5. Elliptical lamina 


For calculating the inertia tensor of a homogeneous elliptical lamina, the 
additivity formula (2.28) is not very helpful. It is easier to make a direct 
evaluation of the integral 


$$\mathcal{I}u = \int dm(r)\; r\,r\wedge u. \tag{2.31}$$


Let the semi-major and minor axes of the ellipse be specified by vectors a and 
b. The points in the ellipse can be parametrized by the equation 

$$r = ax + by, \tag{2.32a}$$

where 

$$x^2 + y^2 = \lambda^2 \le 1. \tag{2.32b}$$


This reduces the integration over the ellipse to integration over a circular 
disk, which is readily carried out by using polar coordinates $\lambda$ and $\phi$. Thus, 
the element of area of the ellipse is $dA = (a\,dx)(b\,dy) = ab\,\lambda\,d\lambda\,d\phi$, so the 
area of the ellipse is 

$$A = ab\int_0^1 \lambda\,d\lambda\int_0^{2\pi} d\phi = \pi ab,$$

in agreement with our calculation in Section 4.2 by a different method. Now, 
when (2.32a) is substituted into (2.31) we obtain 

$$\mathcal{I}u = \frac{m}{\pi}\left(a\,a\wedge u\int x^2\,dx\,dy + b\,b\wedge u\int y^2\,dx\,dy\right),$$

for it is obvious by symmetry that 

$$\int xy\,dx\,dy = 0.$$

Also by symmetry, we have 

$$\int x^2\,dx\,dy = \int y^2\,dx\,dy = \tfrac{1}{2}\int \lambda^2\,dx\,dy = \tfrac{1}{2}\int_0^1 \lambda^3\,d\lambda\int_0^{2\pi} d\phi = \frac{\pi}{4}.$$

Thus, we obtain 

$$\mathcal{I}u = \frac{m}{4}\left(a\,a\wedge u + b\,b\wedge u\right) \tag{2.33}$$

for the inertia tensor of a homogeneous elliptical disk. 
Note that before the last integral over $\lambda$ was carried out we could have 
written the inertia tensor in the form 

$$\mathcal{I}u = \int_0^1 dm_\lambda\,\frac{\lambda^2}{2}\left(a\,a\wedge u + b\,b\wedge u\right), \tag{2.34}$$

where 

$$dm_\lambda = \frac{m}{\pi}\,2\pi\lambda\,d\lambda = 2m\lambda\,d\lambda.$$

So by using the additivity principle in reverse we can conclude that the inertia 
tensor for a homogeneous elliptical loop is 

$$\mathcal{I}u = \frac{m}{2}\left(a\,a\wedge u + b\,b\wedge u\right). \tag{2.35}$$

Note also that the inertia tensor for a flat elliptical ring can be calculated from 
(2.34) by raising the lower limit of integration. 
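
As a worked instance of that last remark, suppose the ring occupies $\lambda_0 \le \lambda \le 1$ in the parametrization (2.32), so that its inner boundary is an ellipse similar to the outer one. The symbols $\lambda_0$ and $M$ (the mass of the ring) are introduced here only for illustration. With $m$ the mass of the full disk in (2.34), a sketch of the calculation is:

```latex
\begin{align*}
M &= \int_{\lambda_0}^{1} 2m\lambda\,d\lambda = m\,(1 - \lambda_0^2), \\
\mathcal{I}u &= \int_{\lambda_0}^{1} 2m\lambda\,d\lambda\,\frac{\lambda^2}{2}
   \left(a\,a\wedge u + b\,b\wedge u\right)
   = \frac{m}{4}\,(1 - \lambda_0^4)\left(a\,a\wedge u + b\,b\wedge u\right) \\
  &= \frac{M}{4}\,(1 + \lambda_0^2)\left(a\,a\wedge u + b\,b\wedge u\right).
\end{align*}
```

The limit $\lambda_0 \to 0$ recovers the disk result (2.33), and $\lambda_0 \to 1$ recovers the loop result (2.35).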


Matrix Elements and Moments of Inertia 


In most physics books the inertia tensor is calculated and used in matrix form 
only. We have developed techniques for dealing with the inertia tensor as a 
single entity, but we must consider its matrix elements to make contact with 
the literature. 

Let $\{\sigma_k\}$ be any righthanded basis of vectors. According to the definition of 
the inertia tensor (2.18), 

$$\mathcal{I}\sigma_j = \int dm\; r\,r\wedge\sigma_j = \int dm\,(r^2\sigma_j - r_j\,r), \tag{2.36}$$

where $r_j = r\cdot\sigma_j$. From this we obtain the matrix elements 

$$I_{ij} = \sigma_i\cdot(\mathcal{I}\sigma_j) = \int dm\;\sigma_i\cdot(r\,r\wedge\sigma_j) = \int dm\,(r^2\delta_{ij} - r_ir_j). \tag{2.37}$$


The conventional way to calculate an inertia tensor is by evaluating the 
integrals (2.37) to get its matrix elements. The diagonal elements of the 
matrix $I_{ij}$ are called "moments of inertia". In particular, 

$$I_{11} = \int dm\,(r^2 - r_1^2) = \int dm\,(r_2^2 + r_3^2) \tag{2.38}$$

is "the moment of inertia about $\sigma_1$". 

Of course, the numerical values of the matrix $I_{ij}$ depend on the basis $\{\sigma_k\}$ 
to which it is referred. The matrix takes its simplest form when referred to a 
basis of principal vectors $e_k$, for then 

$$\mathcal{I}e_k = I_ke_k, \tag{2.39}$$

which produces the diagonal matrix 

$$e_i\cdot(\mathcal{I}e_j) = I_j\,\delta_{ij}, \tag{2.40}$$

where, according to (2.37) and (2.38), 

$$I_k = \int dm\,|r\wedge e_k|^2. \tag{2.41}$$

The principal values $I_k$ of the inertia tensor are called principal moments of 
inertia. 

The trace of the inertia tensor $\operatorname{Tr}\mathcal{I}$ is defined as the sum of diagonal matrix 
elements. From (2.37) and (2.40) 

$$\operatorname{Tr}\mathcal{I} = I_{11} + I_{22} + I_{33} = I_1 + I_2 + I_3 = 2\int dm\; r^2. \tag{2.42}$$


The right hand side of (2.42) shows that the trace is independent of basis, so 
(2.42) relates any set of diagonal matrix elements to the principal moments of 
inertia. This relation is sometimes useful for calculating principal values. 

Equation (2.38) shows that the moments of inertia $I_{kk}$ must be positive 
numbers, but the special nature of the inertia tensor puts further restrictions 
on their relative values, as we shall now show. From (2.38) and (2.42) we 
have 

$$I_{33} = \int dm\,(r^2 - r_3^2) = \tfrac{1}{2}(I_{11} + I_{22} + I_{33}) - \int dm\; r_3^2.$$

Hence, 

$$I_{11} + I_{22} - I_{33} = 2\int dm\; r_3^2 \ge 0. \tag{2.43}$$

The integral here will vanish only if $r_3 = r\cdot\sigma_3 = 0$ for all points in the body, 
which can occur only if the points lie in a plane with normal $\sigma_3$. Therefore, for 
a plane lamina with normal $\sigma_3$, (2.43) reduces to 

$$I_{11} + I_{22} - I_{33} = 0. \tag{2.44}$$


Of course, a plane lamina is only a convenient mathematical idealization 
justifiable for bodies of negligible thickness, so for a real body (2.44) can be 
only approximately true. 

The relation (2.43) holds for any orthonormal basis, so it applies to the 
principal moments of inertia; thus, 

$$I_1 + I_2 - I_3 \ge 0. \tag{2.45}$$


Furthermore, this relation holds for any permutation of the subscripts, so the 
sum of any two principal values can never be less than the other principal 
value. 
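
The relations (2.43) and (2.44) are easy to confirm for a handful of particles. The sketch below builds the matrix (2.37) for a hypothetical set of point masses (the data and the helper name `inertia_matrix` are illustrative), then flattens the same set into the plane $r_3 = 0$ to imitate a lamina.

```python
def inertia_matrix(particles):
    """Matrix I_ij = sum m (r^2 delta_ij - r_i r_j) for (mass, position) pairs."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in particles:
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r2 if i == j else 0.0) - r[i] * r[j])
    return I


body = [(1.0, (1.0, 0.0, 0.5)), (2.0, (-0.5, 1.5, -1.0)), (0.5, (0.2, -0.7, 2.0))]
I = inertia_matrix(body)
excess = I[0][0] + I[1][1] - I[2][2]                  # left side of (2.43)
twice_r3_squared = 2.0 * sum(m * r[2] ** 2 for m, r in body)

lamina = [(m, (r[0], r[1], 0.0)) for m, r in body]    # all points moved into the plane r_3 = 0
L = inertia_matrix(lamina)
lamina_excess = L[0][0] + L[1][1] - L[2][2]           # left side of (2.44)
```

For the three-dimensional set `excess` equals `twice_r3_squared` and is positive; for the flattened set `lamina_excess` vanishes.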


Moment of Inertia and Radius of Gyration About a Line 


So far we have discussed moments of inertia only as parts of a complete 
matrix of inertia. But when a body is rotating about a fixed axis with known 
direction u, only the moment of inertia about that axis is of interest. The 
moment of inertia about (a line with direction) u is 


$$I_u = u\cdot(\mathcal{I}u) = \int dm\,|r\wedge u|^2. \tag{2.46}$$


Equation (2.46) implies that $I_u$ is always a positive number for a real body. 
Vanishing moments of inertia occur only for mathematical idealizations such 
as a rod without thickness, as described by (2.19). 

Note that $r_\perp = |r\wedge u|$ is the perpendicular distance of the point r from the 
line with direction u passing through the origin. Therefore, the integral in 
(2.46) is a sum of the squared distances $r_\perp^2$ weighted by the mass of each 
point. Instead of the moment of inertia $I_u$, it is sometimes convenient to use 
the radius of gyration about u defined by 

$$I_u = m k_u^2. \tag{2.47}$$

The radius of gyration $k_u$ can be interpreted as the distance from the axis of a 
single particle with mass m with the same moment of inertia as the body with 
tensor $\mathcal{I}$ in (2.46). This follows from the fact that the inertia tensor for a 
particle at R is $\mathcal{I}u = mRR\wedge u$, so 

$$u\cdot\mathcal{I}u = m\,|R\wedge u|^2,$$

which agrees with (2.47) if $k_u = |R\wedge u|$. Thus, we have a kind of equivalence 
in inertia of a whole body to a single particle. Our next task will be to 
investigate this kind of equivalence systematically and in complete generality. 


Classification of Rigid Bodies 


Two rigid bodies are said to be equimomental if they have the same mass and 
principal moments of inertia. A glance at the equations of motion for a rigid 
body in Section 7-1 shows that they are identical for equimomental bodies 
subject to equivalent forces. Thus, equimomental bodies are dynamically 
identical, though they may differ greatly in size and shape. The shape of a 
body is relevant to its motion only in the possibilities it gives for applying 
contact forces. 

It is always possible to find a small set of particles equimomental to a given 
continuous body. For example, from the inertia tensor (2.19) for a rod we can 
see that the rod is equimomental to a system of three particles, one particle of 
mass m/6 at each of the points $\pm a$ and one particle with mass 2m/3 at the 
origin. The particles are symmetrically placed so their centroid will be at the 
origin, and the particle at the origin is needed so the system will have the 
correct total mass. It will be noted that other sets of particles will give the 
same result, for instance, two particles with mass $\tfrac{1}{2}m$ at the points $\pm a/\sqrt{3}$. 
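
The two particle systems just described can be compared with the rod tensor (2.19) directly. The following sketch uses illustrative values for m, a and u, and hypothetical helper names; it evaluates $\sum_k m_k\,r_k\,r_k\wedge u$ for each system and $\tfrac{m}{3}\,a\,a\wedge u$ for the rod.

```python
import math


def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def rr_wedge(r, u):
    """The vector r (r ^ u) = r^2 u - (r . u) r."""
    r2, ru = dot(r, r), dot(r, u)
    return tuple(r2 * ui - ru * ri for ui, ri in zip(u, r))


def particle_tensor(particles, u):
    out = [0.0, 0.0, 0.0]
    for mk, rk in particles:
        for j, c in enumerate(rr_wedge(rk, u)):
            out[j] += mk * c
    return tuple(out)


m = 3.0
a = (1.0, -2.0, 0.5)
u = (0.7, 0.1, -0.3)
neg_a = tuple(-c for c in a)
s = 1.0 / math.sqrt(3.0)

rod = tuple(m / 3.0 * c for c in rr_wedge(a, u))                       # (2.19)
three = [(m / 6.0, a), (m / 6.0, neg_a), (2.0 * m / 3.0, (0.0, 0.0, 0.0))]
two = [(m / 2.0, tuple(s * c for c in a)), (m / 2.0, tuple(-s * c for c in a))]
three_tensor = particle_tensor(three, u)
two_tensor = particle_tensor(two, u)
```

Both particle systems have total mass m, centroid at the origin, and the same tensor as the rod.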

Rigid bodies fall into three dynamically distinct classes, depending on the 
relative values of their three principal moments of inertia $I_1$, $I_2$, $I_3$. A body is 
said to be 

(1) centrosymmetric if all its principal moments are equal ($I_1 = I_2 = I_3$), 

(2) axially symmetric if it has exactly two distinct principal moments (e.g. 
$I_1 = I_2 \ne I_3$), 

(3) asymmetric if it has three distinct principal moments (e.g. $I_1 > I_2 > I_3$). 

This terminology necessarily differs from that in other books, since there is no 
standard terminology available for these three classes. 

The inertia tensor of a centrosymmetric body is determined by a single 
number, its principal moment of inertia $I = I_1 = I_2 = I_3$. Every line through 
the center of mass is a principal axis, so the dynamics of the body is 
completely independent of its attitude, that is, the dynamics would be 
unaffected by a finite rotation of the body about any axis through the center of 
mass. This rotational invariance of a centrosymmetric body is a dynamical 
symmetry and should not be confused with the geometrical symmetry of a 
body. The bodies listed in Table 2.1 are all centrosymmetric, so they have the 
same dynamical symmetries, but their geometrical symmetries are quite 
different. For example, the sphere is geometrically symmetrical under any 
rotation about its center, so its dynamical symmetry is identical to its 
geometrical symmetry. However, the cube is geometrically symmetrical only 
under particular finite rotations, such as a rotation by $2\pi/3$ about one of its 
diagonals. Nevertheless, the geometrical symmetries of the cube imply that 
all its principal moments of inertia are equal by theorems we have established 
above, so its dynamical symmetry is a consequence of its geometrical 
symmetry. For other bodies in Table 2.1, such as the hemisphere or the cylinder, 
geometrical symmetry alone is not sufficient to determine dynamical 
symmetry. 

TABLE 2.1. The moment of inertia for some centrosymmetric homogeneous bodies 

Body                                                            Moment of inertia I 

(Solid) sphere (radius a)                                       $\tfrac{2}{5}ma^2$ 
Hollow sphere                                                   $\tfrac{2}{3}ma^2$ 
Hemisphere                                                      $\tfrac{2}{5}ma^2$ 
Hemispherical shell                                             $\tfrac{2}{3}ma^2$ 
(Solid) cube (side a)                                           $\tfrac{1}{6}ma^2$ 
Regular tetrahedron (side a)                                    $\tfrac{1}{20}ma^2$ 
Right circular cylinder of radius a and height $a\sqrt{3}$      $\tfrac{1}{2}ma^2$ 
Cone of radius a and height $\tfrac{1}{2}a$                     $\tfrac{3}{10}ma^2$ 

Inertia tensors of some axially symmetric bodies are expressed in "canonical 
form" in Table 2.2. These tensors are completely determined by a unit 
vector e along the "dynamical symmetry axis" and the two distinct principal 
moments of inertia $I_3$ and $I_1 = I_2$. The vector e is the principal vector of the 
inertia tensor $\mathcal{I}$ corresponding to the principal value $I_3$, so $I_1$ is the principal 
value of any vector orthogonal to e. To determine the effect of $\mathcal{I}$ on an 
arbitrary vector u, we break u into components parallel and perpendicular to 
e. Thus, 

$$\mathcal{I}u = \mathcal{I}(u_\parallel + u_\perp) = \mathcal{I}u_\parallel + \mathcal{I}u_\perp = I_3u_\parallel + I_1u_\perp = I_3\,e\,e\cdot u + I_1\,e\,e\wedge u.$$

The dependence on e can be simplified by using $e\,e\wedge u = e(eu - e\cdot u) = u - e\,e\cdot u$, so 

$$\mathcal{I}u = I_1u + (I_3 - I_1)\,e\,e\cdot u. \tag{2.48}$$

This is the "canonical form" adopted for the tensors in Table 2.2. There are 
other interesting forms for the inertia tensor. For example, substitution of 
$e\cdot u = \tfrac{1}{2}(eu + ue)$ into (2.48) yields the "symmetrical" form 

$$\mathcal{I}u = \tfrac{1}{2}(I_1 + I_3)\,u + \tfrac{1}{2}(I_3 - I_1)\,e\,u\,e. \tag{2.49}$$

However, we shall see that (2.48) is most useful when we study dynamics. 

The "canonical forms" for the inertia tensors of asymmetric bodies in Table 
2.3 have been chosen to show their relation to the geometrical structure of the 
bodies. The principal axes and principal values are not evident from these 
forms, but must be calculated by the methods of Section 5-2. Some examples 
are given in the exercises. We shall see that it is important to know the 
principal axes when analyzing the motion of a body. 


TABLE 2.2. Inertia tensors for some axially symmetric homogeneous bodies 

Bodies tabulated (each figure shows the axis e, the origin, and the dimensions a, b, h): 
flat circular ring, cylindrical tube, ellipsoid of revolution, circular cone, 
half conical shell, half cylindrical shell, solid semicylinder, half torus. 

Body                                      Inertia tensor $\mathcal{I}u$ 

Cylindrical tube                          $\tfrac{m}{4}\,[(a^2 + b^2 + \tfrac{1}{3}h^2)\,u + (a^2 + b^2 - \tfrac{1}{3}h^2)\,e\,e\cdot u]$ 
Ellipsoid of revolution                   $\tfrac{m}{5}\,[(a^2 + b^2)\,u + (a^2 - b^2)\,e\,e\cdot u]$ 
Circular cone (solid, origin at vertex)   $\tfrac{3m}{20}\,[(a^2 + 4h^2)\,u + (a^2 - 4h^2)\,e\,e\cdot u]$ 
Circular cone (hollow, origin at vertex)  $\tfrac{m}{4}\,[(a^2 + 2h^2)\,u + (a^2 - 2h^2)\,e\,e\cdot u]$ 

Entries for the flat circular ring, half conical shell, half cylindrical shell, 
solid semicylinder and half torus are not legible in this copy. 


TABLE 2.3. Inertia tensors of some asymmetric homogeneous bodies 

Body                       Inertia tensor $\mathcal{I}u$ 

Parallelepiped             $\tfrac{m}{12}\,[a\,a\wedge u + b\,b\wedge u + c\,c\wedge u]$ 
Ellipsoid                  $\tfrac{m}{5}\,[a\,a\wedge u + b\,b\wedge u + c\,c\wedge u]$ 
Ellipsoidal shell          $\tfrac{m}{3}\,[a\,a\wedge u + b\,b\wedge u + c\,c\wedge u]$ 
Elliptical cylinder        $\tfrac{m}{4}\,[a\,a\wedge u + b\,b\wedge u + \tfrac{1}{3}\,h\,h\wedge u]$ 
Elliptical cone            $\tfrac{3m}{20}\,[a\,a\wedge u + b\,b\wedge u + 4\,h\,h\wedge u]$ 
Hollow elliptical cone     $\tfrac{m}{4}\,[a\,a\wedge u + b\,b\wedge u + 2\,h\,h\wedge u]$ 
Triangular lamina          $\tfrac{m}{36}\,[a\,a\wedge u + b\,b\wedge u + c\,c\wedge u]$ 
Triangular prism           $\tfrac{m}{36}\,[a\,a\wedge u + b\,b\wedge u + c\,c\wedge u + 3\,h\,h\wedge u]$ 


7-2. Exercises 


(2.1) Calculate the centroid and moment of inertia of a hemispherical 
shell. 

(2.2) Find the centroids of a solid circular cone, a hollow circular cone, 
and half conical shell (Origin as indicated in Table 2.2). 

(2.3) Find the centroids of a solid semicylinder and a half cylindrical shell 
(Origin as indicated in Table 2.2). 

(2.4) Find the centroid of a half torus (Origin as indicated in Table 2.2). 

(2.5) Find the centroid of a spherical cap cut from a sphere of radius a 
by a plane at a distance b from the center of the sphere. 

(2.6) Determine the centroid and volume of the intersection of a sphere 
of radius a with a cone of vertex angle $2\phi$ and vertex at the center of 
the sphere. 

(2.7) A cylindrical hole of radius b is cut from a cube of width 2a with an 
axis at a distance d from the center of the cube and perpendicular to 
a face of the cube. Find the mass, centroid, and inertia tensor of the 
resulting body if it has uniform density $\varrho$. 

(2.8) Calculate the moment of inertia of a homogeneous cube directly from 
Equation (2.18). 

(2.9) Let three intersecting edges of a cube be designated by vectors $a_1$, 
$a_2$, $a_3$, with $a_1^2 = a_2^2 = a_3^2 = a^2$. Calculate the inertia tensor about a 
corner of the cube. Determine its principal vectors and principal 
values and its matrix elements with respect to the edges of the cube. 

(2.10) Find the moment of inertia of a rectangular parallelepiped with 
edges a, b, c about a diagonal. 
(2.11) A particle of unit mass is located at each of the four points 
G, — 56, — 6), 3( 6.64 6, 4.0;), 6, + 6, +.56,,—56, 4 6, — 6. 
Find the centroid and inertia tensor of the system. 
(2.12) A homogeneous rod of mass m and length 2a is rotating with 
angular velocity w about one end. What is its kinetic energy? 
(2.13) Three particles of unit mass are located at the points r, = ao,, 
r, = ao, + 2a0,, r, = 2a0, + ao,. Find the principal moments of 
inertia about the origin and the corresponding principal values. 
Express the inertia tensor about the origin in terms of its principal 
vectors. 

(2.14) Calculate the inertia tensor for an ellipsoid in the form given in 
Table 2.3 by generalizing the method of Example 2.5 to reduce 
integration over the ellipsoid to easy integrations over a sphere. 

(2.15) For a given body, prove that if a line is a principal axis for one of its 
points, then it is a principal axis for all of its points and it passes 
through the centroid of the body, and conversely. 
Prove also that a line with direction u and distance $d \ne 0$ from the 
centroid can be a principal axis for one of its points if and only if 
$d\wedge u\wedge\mathcal{I}u = 0$. 

(2.16) Derive the inertia tensor for a homogeneous triangle in the form 
shown in Table 2.3. Show that its principal values are 

$$I_\pm = \frac{m}{36}\left[\frac{a^2 + b^2 + c^2}{2} \pm \sqrt{a^4 + b^4 + c^4 - a^2b^2 - b^2c^2 - c^2a^2}\,\right].$$

(2.17) If $r_1$, $r_2$, $r_3$ locate the vertices with respect to the centroid, show that 
the inertia tensor of a triangle of mass m can be put in the form 

$$\mathcal{I}u = \frac{m}{12}\sum_{k=1}^{3} r_k\,r_k\wedge u.$$

(2.18) Let $a_1, a_2, \dots, a_6$ designate the edges of a homogeneous tetrahedron 
and let $r_1$, $r_2$, $r_3$, $r_4$ locate the vertices with respect to the centroid. 
Show that its inertia tensor is 

$$\mathcal{I}u = \frac{m}{80}\sum_{k=1}^{6} a_k\,a_k\wedge u = \frac{m}{20}\sum_{k=1}^{4} r_k\,r_k\wedge u.$$

(2.19) A homogeneous triangle of mass m is equimomental to a system of 
four particles with one at its centroid and three of the same mass 
either at its vertices or at the midpoints of its sides. Find the particle 
masses in each of the two cases. 

(2.20) Show that a homogeneous tetrahedron of mass m is equimomental 
to a system of six particles with mass m/10 at the midpoints of its 
edges and one particle of mass 2m/5 at its centroid. 

(2.21) Prove that every body is equimomental to some system of four 
equal mass particles, and that every plane lamina is equimomental 
to three equal mass particles. 

(2.22) Legendre's Ellipsoid. Prove that every body is equimomental to an 
ellipsoid. 

(2.23) Inertia tensors $\mathcal{I}$ and $\mathcal{I}'$ are related by the displacement equation 

$$\mathcal{I}'u = \mathcal{I}u + mRR\wedge u.$$

Their principal vectors and principal values are specified by 

$$\mathcal{I}e_k = I_ke_k, \qquad \mathcal{I}'e'_k = I'_ke'_k.$$

For $R\cdot e_3 = 0$, prove that: 

(a) $e'_3 = e_3$ and $I'_3 = I_3 + mR^2$. 

(b) For k = 1, 2 and $i = e_1e_2$, 

$$e'_k = e_k\,e^{i\phi}, \quad\text{where}\quad \tan 2\phi = \frac{2mR_1R_2}{I_2 - I_1 + m(R_1^2 - R_2^2)}.$$

(c) The principal values $I'_1$ and $I'_2$ are given by 

$$\tfrac{1}{2}(I_1 + I_2 + mR^2) \pm \tfrac{1}{2}\left\{\big(I_2 - I_1 + m(R_1^2 - R_2^2)\big)^2 + 4m^2R_1^2R_2^2\right\}^{1/2},$$

where $R_k = R\cdot e_k$. 
(2.24) For the plane lamina in Figure 2.5, calculate the principal moments of 
inertia and the angle $\phi$ specifying the relative direction of the principal axes. 

Fig. 2.5. Plane lamina indicated by shaded area. 


7-3. The Symmetrical Top 


In this section we study the rotational motion of an axially symmetric body, 
which might be referred to as a top, a pendulum or a gyroscope, depending on 
the nature of the motion. From Section 7-2, we know that the inertia tensor for 
an axially symmetric body can be put in the form 

$$l = \mathcal{I}\omega = I\omega + (I_3 - I)\,\omega\cdot e\,e, \tag{3.1}$$

where $I_3$ and $I_1 = I_2 = I$ are the principal moments of inertia, and e is the 
direction of the symmetry axis. According to the parallel axis theorem, this 
form for the inertia tensor will be preserved by any shift of base point along 
the symmetry axis; the effect of such a shift is merely to change numerical 
values for the moments of inertia. Consequently, the results of this section 
apply to the motion of any symmetrical top with a fixed point somewhere on 
its symmetry axis. 

Recall that the rotational equations of motion consist of a dynamical 
equation (Euler's equation) 

$$\dot{l} = \Gamma \tag{3.2}$$

for the angular momentum l driven by an applied torque $\Gamma$, and a kinematical 
equation 

$$\dot{R} = \tfrac{1}{2}R\,i\omega \tag{3.3}$$

for the attitude spinor R depending on the rotational velocity $\omega$. The spinor R 
determines the principal directions $e_k$ of the body by 

$$e_k = R^\dagger\sigma_kR \tag{3.4}$$

for fixed $\sigma_k$. For an axially symmetric body we write $e = e_3$, so 

$$e = R^\dagger\sigma_3R. \tag{3.5}$$


This relation couples the two equations of motion (3.2) and (3.3) by virtue of 
(3.1), and further coupling derives from the fact that an applied torque 
usually depends on R as well. Such coupling makes the equations of motion 
difficult to solve. Consequently, closed solutions exist for only a few cases, 
and more general cases must be treated by approximations and numerical 
techniques. Fortunately, the simplest cases are the most common, and they 
provide a stepping stone for the analysis of more complex cases, so our 
primary aim will be to master the simplest cases of rotational motion com- 
pletely. 


Qualitative Features of Rotational Dynamics 


Before getting involved in the details of solving Euler’s equation, it is a good
idea to identify the major qualitative features of rotational dynamics.
Suppose a body is spinning rapidly with angular speed $\omega$ about a principal
direction $\mathbf{e}$ with moment of inertia $I_3$. Then its rotational
velocity is $\boldsymbol{\omega} = \omega\mathbf{e}$ and its angular momentum is
$\mathbf{l} = \mathcal{I}\boldsymbol{\omega} = \omega\mathcal{I}\mathbf{e} = \omega I_3\mathbf{e}$.
This state of affairs will persist as long as no forces act on the body, since
$\dot{\mathbf{l}} = 0$. Now, if a small force $\mathbf{F}$ is applied to a point
$\mathbf{r} = r\mathbf{e}$ on the axis of rotation, Euler’s equation can be
put in the approximate form

$$\dot{\mathbf{l}} = I_3\omega\dot{\mathbf{e}} = r\mathbf{e}\times\mathbf{F},$$

or

$$\dot{\mathbf{e}} = \left(\frac{-r\mathbf{F}}{I_3\omega}\right)\times\mathbf{e}. \tag{3.6}$$

This equation displays two major features of gyroscopic motion. First, it tells
us that a force applied to the axis of a rapidly spinning body causes the body
to move in a direction perpendicular to the force. This explains why leaning to
one side on a moving bicycle makes it turn rather than fall over. Such
behavior may seem paradoxical, because it is so different from the behavior of
a nonspinning body. But it can be seen as a consequence of elementary
kinematics in the following way. Consider the effect of an impulse
$\mathbf{F}\,\Delta t$ delivered in a short time to the spinning disk in
Figure 3.1. Since the body is rigid, the torque about the center of the disk
exerted by the force $\mathbf{F}$ is equivalent to the torque exerted by a
force $\mathbf{F}'$ applied at a point on the rim as shown. But the effect of
an impulse $\mathbf{F}'\,\Delta t$ is to alter the velocity $\mathbf{v}$ of
the rim by an amount $\Delta\mathbf{v}$, thus producing rotational motion
perpendicular to $\mathbf{F}$ as asserted.

The second thing that Equation (3.6) tells us is that the larger the values of
$I_3$ and $\omega$, the smaller the effect of $\mathbf{F}$. This is sometimes
called gyroscopic stiffness, and it accounts for the great directional
stability of a gyroscope. The dynamical properties of a slowly rotating body
are quite different. Indeed, for validity of (3.6) it is necessary that
$|r\mathbf{F}/I_3\omega| \ll \omega$. This condition determines what is meant
by a small force in gyroscopic problems.

Now let us turn to a detailed analysis of the equations of motion and their
ramifications.

Fig. 3.1. Deflection of a spinning body by an impulse.
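To get a feel for the magnitudes in (3.6), the following minimal Python sketch evaluates the deflection rate $|\dot{\mathbf{e}}| = rF/(I_3\omega)$ for a force perpendicular to the axis and tests the small-force condition. The function name, the sample numbers and the 1% threshold used to stand for "much smaller than" are illustrative assumptions, not values from the text.

```python
def deflection_rate(r, force, i3, spin):
    """Magnitude of e-dot from (3.6), for a force perpendicular to the spin axis."""
    return r * force / (i3 * spin)

# Hypothetical numbers: r = 0.1 m, F = 1 N, I3 = 0.05 kg m^2, spin = 100 rad/s
spin = 100.0
rate = deflection_rate(0.1, 1.0, 0.05, spin)
rate_doubled_spin = deflection_rate(0.1, 1.0, 0.05, 2.0 * spin)

# "Small force" condition |rF / (I3 w)| << w, read here (by assumption) as < 1 % of w
is_small_force = rate < 0.01 * spin

print(rate, rate_doubled_spin, is_small_force)
```

Doubling the spin halves the deflection rate, which is the gyroscopic stiffness described above.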


Free Precession 


A spinning body is said to be spinning freely if the resultant torque
$\boldsymbol{\Gamma}$ on it vanishes. According to Euler’s equation
$\dot{\mathbf{l}} = \boldsymbol{\Gamma}$, then, the angular momentum
$\mathbf{l}$ of a freely spinning body is a constant of the motion. To obtain a
complete description of motion, it is still necessary to determine the attitude
$R$ as a function of time by integrating the kinematical equation
$\dot{R} = \tfrac{1}{2}Ri\boldsymbol{\omega}$. To do this, we need to know the
rotational velocity $\boldsymbol{\omega}$, but that can be obtained from (3.1).
Thus, we find

$$\boldsymbol{\omega} = \frac{\mathbf{l}}{I} + \frac{I - I_3}{II_3}\,(\mathbf{l}\cdot\mathbf{e})\,\mathbf{e}. \tag{3.7}$$
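As a quick consistency check, the relation $\boldsymbol{\omega} = \mathbf{l}/I + \frac{I - I_3}{II_3}(\mathbf{l}\cdot\mathbf{e})\mathbf{e}$ should invert the inertia tensor (3.1). The sketch below verifies this numerically with plain Python lists; the helper names and the sample vectors are assumptions chosen only for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def omega_from_l(l, e, I, I3):
    """Rotational velocity from angular momentum, as in (3.7); e must be a unit vector."""
    c = (I - I3) / (I * I3) * dot(l, e)
    return [li / I + c * ei for li, ei in zip(l, e)]

def l_from_omega(w, e, I, I3):
    """Angular momentum from rotational velocity, as in (3.1)."""
    c = (I3 - I) * dot(w, e)
    return [I * wi + c * ei for wi, ei in zip(w, e)]

# Hypothetical sample data
e = [0.0, 0.6, 0.8]          # unit symmetry axis
l = [1.0, 2.0, 3.0]          # angular momentum
I, I3 = 2.0, 5.0             # an oblate body (I3 > I)

w = omega_from_l(l, e, I, I3)
l_back = l_from_omega(w, e, I, I3)
print(w, l_back)
```

The round trip returns the original $\mathbf{l}$, and the components along the axis satisfy $\mathbf{l}\cdot\mathbf{e} = I_3\,\boldsymbol{\omega}\cdot\mathbf{e}$.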


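Now, using $\mathbf{e} = R^\dagger\boldsymbol{\sigma}_3R$ and $RR^\dagger = 1$,
we can put the kinematical equation of motion in the form

$$\dot{R} = \tfrac{1}{2}Ri\boldsymbol{\omega} = \tfrac{1}{2}Ri\frac{\mathbf{l}}{I} + \tfrac{1}{2}\frac{(I - I_3)}{II_3}\,\mathbf{l}\cdot\mathbf{e}\,Ri\mathbf{e} = \tfrac{1}{2}Ri\frac{\mathbf{l}}{I} + \tfrac{1}{2}\frac{(I - I_3)}{II_3}\,\mathbf{l}\cdot\mathbf{e}\,i\boldsymbol{\sigma}_3R.$$

This is an equation of the form

$$\dot{R} = \tfrac{1}{2}i\boldsymbol{\omega}_1R + \tfrac{1}{2}Ri\boldsymbol{\omega}_2, \tag{3.8a}$$

where

$$\boldsymbol{\omega}_2 = \frac{\mathbf{l}}{I} \quad\text{and}\quad \boldsymbol{\omega}_1 = \frac{(I - I_3)}{II_3}\,\mathbf{e}\cdot\mathbf{l}\,\boldsymbol{\sigma}_3. \tag{3.8b}$$

For a freely spinning body, both $\boldsymbol{\omega}_1$ and
$\boldsymbol{\omega}_2$ are constant, so (3.8a) has the elementary solution

$$R = e^{(1/2)i\boldsymbol{\omega}_1t}\,R_0\,e^{(1/2)i\boldsymbol{\omega}_2t}, \tag{3.9}$$

where $R_0$ describes the attitude at time $t = 0$.

From the solution (3.9), we find that the motion of the body’s symmetry
axis is given by

$$\mathbf{e} = R^\dagger\boldsymbol{\sigma}_3R = e^{-(1/2)i\boldsymbol{\omega}_2t}\,\mathbf{e}_0\,e^{(1/2)i\boldsymbol{\omega}_2t}, \tag{3.10}$$

where $\mathbf{e}_0 = R_0^\dagger\boldsymbol{\sigma}_3R_0$ is the initial
direction of the symmetry axis. Equation (3.10) tells us that the symmetry axis
precesses about the angular momentum vector $\mathbf{l}$ with a constant
angular velocity $\boldsymbol{\omega}_2 = \mathbf{l}/I$, as shown in
Figure 3.2a, b. Therefore, by observing the free precession of the symmetry
axis, the angular momentum can be determined.

Fig. 3.2a. Free precession for an oblate body ($\omega_1 < 0$).
Fig. 3.2b. Free precession for a prolate body ($\omega_1 > 0$).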


We can now supply the physical interpretation of the solution (3.9). The
first factor in (3.9) describes a rotation of the body with constant angular
speed $\omega_1$ about its symmetry axis positioned along some arbitrary
direction $\hat{\boldsymbol{\omega}}_1 = \boldsymbol{\sigma}_3$. The second
factor $R_0$ “tilts” the symmetry axis from $\boldsymbol{\sigma}_3$ to a
specified direction $\mathbf{e}_0 = R_0^\dagger\boldsymbol{\sigma}_3R_0$. Of
course, we could take $R_0 = 1$ if we chose $\boldsymbol{\sigma}_3 = \mathbf{e}_0$.
The third factor in (3.9) describes a precession of the body. Thus, the solution
(3.9) describes a body spinning with angular speed $\omega_1$ about its symmetry
axis while it precesses with angular velocity $\boldsymbol{\omega}_2$. This
motion is called Eulerian free precession.

The resultant rotational velocity $\boldsymbol{\omega}$ can be determined from
(3.8a), (3.9) and (3.10); thus,

$$\boldsymbol{\omega} = \frac{2}{i}R^\dagger\dot{R} = R^\dagger\boldsymbol{\omega}_1R + \boldsymbol{\omega}_2 = \omega_1\mathbf{e} + \boldsymbol{\omega}_2 = e^{-(1/2)i\boldsymbol{\omega}_2t}\,(\omega_1\mathbf{e}_0 + \boldsymbol{\omega}_2)\,e^{(1/2)i\boldsymbol{\omega}_2t}. \tag{3.11}$$

This tells us that $\boldsymbol{\omega}$ precesses about
$\mathbf{l} = I\boldsymbol{\omega}_2$ along with $\mathbf{e}$, with the three
vectors $\boldsymbol{\omega}$, $\mathbf{l}$, $\mathbf{e}$ remaining in a common
plane ($\boldsymbol{\omega}\wedge\mathbf{l}\wedge\mathbf{e} = 0$). The
precession of $\mathbf{e}$ and $\boldsymbol{\omega}$ is illustrated in
Figure 3.2a for an oblate body ($I < I_3$) such as the Earth or a disk, and in
Figure 3.2b for a prolate body ($I > I_3$) such as a football.

Free precession of the Earth produces an observable variation in latitude.
From (3.9) and (3.11) we find that the resultant rotational velocity
$\boldsymbol{\omega}'$ in the body frame of the Earth is

$$\boldsymbol{\omega}' = R\boldsymbol{\omega}R^\dagger = \frac{2}{i}\dot{R}R^\dagger = e^{(1/2)i\boldsymbol{\omega}_1t}\,R_0(\omega_1\mathbf{e}_0 + \boldsymbol{\omega}_2)R_0^\dagger\,e^{-(1/2)i\boldsymbol{\omega}_1t}. \tag{3.12}$$

This says that $\boldsymbol{\omega}'$ precesses with angular velocity
$-\boldsymbol{\omega}_1$ about the Earth’s symmetry axis
$\hat{\boldsymbol{\omega}}_1 = \boldsymbol{\sigma}_3$. The vector
$\boldsymbol{\omega}'$ points in the direction of the celestial north pole. On
a short time-lapse photograph with a camera pointing vertically upward the
star trails are arcs of concentric circles centered on the celestial pole, so
the direction of $\boldsymbol{\omega}'$ relative to the Earth can be measured
with great accuracy. Successive measurements determine its motion relative to
the Earth.

One complete rotation of the Earth about the celestial pole defines

$$1\ \text{sidereal day} = \frac{2\pi}{|\boldsymbol{\omega}'|} = \frac{2\pi}{|\boldsymbol{\omega}|}.$$

From an independent analysis of perturbations by the Moon (Chapter 8), it
has been determined that $I/(I_3 - I) = 303$. Therefore, (3.12) and (3.11)
predict that the celestial pole will precess about the Earth’s axis with period

$$T = \frac{2\pi}{|\omega_1|} = \frac{I}{I_3 - I}\,\frac{2\pi}{|\boldsymbol{\omega}\cdot\mathbf{e}|} \approx 303\ \text{days}. \tag{3.13}$$


Empirically, it is found that the celestial pole precesses irregularly about the
Earth’s axis, tracing in one year a roughly elliptical path on the Earth’s 
surface with a mean radius of about four meters. Analysis of the data shows 
that the orbit has two distinct periods, one of 12-month and the other of 
14-month duration. The 12-month period is accounted for by seasonal 
changes in the weather, primarily the formation of ice and snow in polar 
regions. The motion with a 14-month period is called the Chandler wobble 
after the man who discovered it. This is identified with the Eulerian wobble 
predicted above. 
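The rigid-body estimate behind the Eulerian wobble is simple arithmetic: with $I/(I_3 - I) = 303$, the Euler period is about 303 rotation periods of the Earth. The short Python sketch below converts that to months and compares it with the observed Chandler period. The conversion factors (sidereal day in solar days, mean month length) are assumptions supplied here, not values from the text.

```python
# Assumed conversion factors (not from the text)
SIDEREAL_DAY_IN_SOLAR_DAYS = 0.99727
MEAN_MONTH_IN_SOLAR_DAYS = 365.25 / 12.0

inertia_ratio = 303.0                 # I / (I3 - I), quoted in the text
period_sidereal_days = inertia_ratio  # Euler period in units of the rotation period

period_months = (period_sidereal_days * SIDEREAL_DAY_IN_SOLAR_DAYS
                 / MEAN_MONTH_IN_SOLAR_DAYS)
observed_months = 14.0                # Chandler wobble period quoted in the text
lengthening = observed_months / period_months

print(round(period_months, 2), round(lengthening, 2))
```

The rigid-body value comes out close to 10 months, so the observed 14-month period is longer by roughly 40 percent.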

The large discrepancy between the observed 14-month period for the
Chandler wobble and the prediction of a 10-month period in (3.13) deserves
some explanation. It results from the fact that the Earth is not an ideal rigid
body. Without developing a detailed theory, we can see qualitatively that the
elasticity of the Earth will lengthen the period. Consider what would happen
if the rotation of the Earth were to cease. If the Earth were a liquid body, it
would clearly assume a spherical shape when the centrifugal force is turned
off. But the Earth, or, at least, its shell, is an elastic body, so it will tend to
retain its shape, though its oblateness, as defined by $(I_3 - I)/I_3$, will
decrease. Similarly, the oblateness will be decreased, because the Earth’s
instantaneous axis of rotation is not along the polar axis, and (3.13) shows
that this will lengthen the period of the wobble. A quantitative analysis of
this effect is a complex problem in geophysics, requiring an analysis of
fluctuations due to earthquakes, tidal effects, seasonal motions of air masses,
etc. It is an active area for geophysical research today.


Reduction of the Symmetric Top 


The analysis we have just completed has a significance that goes well beyond
free precession. For it suggests a method of reducing the equations of motion
for an axially symmetric body to the simpler equations of motion for a
centrosymmetric body. Notice that Equations (3.8a, b) are valid for any
motion of an axially symmetric body. Furthermore, we can separate (3.8a)
into the two equations

$$\dot{R}_1 = \tfrac{1}{2}i\boldsymbol{\omega}_1R_1 = \tfrac{1}{2}R_1i\boldsymbol{\omega}_1, \tag{3.14a}$$

$$\dot{R}_2 = \tfrac{1}{2}R_2i\boldsymbol{\omega}_2, \tag{3.14b}$$

with

$$R = R_1R_2.$$

Now (3.14b) is the attitude equation for a centrosymmetric body with moment
of inertia $I$, and it can be solved independently of (3.14a) using the
equation of motion $\dot{\mathbf{l}} = \boldsymbol{\Gamma}$.

Then the solution to (3.14a) can be found from (3.14b); specifically, it has
the form

$$R_1 = e^{(1/2)i\boldsymbol{\sigma}_3\phi}, \tag{3.15a}$$

where, by (3.8b),

$$\phi = \int_0^t \frac{(I - I_3)}{II_3}\,\mathbf{e}\cdot\mathbf{l}\,dt, \tag{3.15b}$$

and

$$\mathbf{e} = R^\dagger\boldsymbol{\sigma}_3R = R_2^\dagger\boldsymbol{\sigma}_3R_2. \tag{3.15c}$$

Note that the integral form of (3.15b) even allows time variations in $I$ and
$I_3$ to be taken into account.

Let us call the reduction of the equations of motion just described the
reduction theorem for a symmetric top, and let us refer to the motion
described by (3.15a) as the Eulerian motion, since it generalizes the Eulerian
free precession. The reduction theorem tells us that the motion of an axially
symmetric body can be obtained from the motion of a centrosymmetric body
simply by superimposing the Eulerian motion. It also tells us that the Eulerian
free precession is maintained even in the presence of a torque
$\boldsymbol{\Gamma}$ if $\boldsymbol{\Gamma}\cdot\mathbf{e} = 0$; for then we
know from Section 7-1 that
$\mathbf{e}\cdot\boldsymbol{\omega} = \mathbf{e}\cdot\mathbf{l}/I_3$ is constant
and (3.15b) integrates to give us the same result as the one obtained for zero
torque. This provides justification for ignoring torques when analyzing the
Eulerian precession of the Earth.


The Spherical Top 

From now on we restrict our analysis to the reduced equations of motion

$$\dot{\mathbf{l}} = I\dot{\boldsymbol{\omega}} = \mathbf{r}\times\mathbf{F}, \tag{3.16}$$

$$\dot{R} = \tfrac{1}{2}Ri\boldsymbol{\omega}. \tag{3.17}$$

These are general equations of motion for the attitude of a centrosymmetric
body subject to a single force acting at a point $\mathbf{r}$ fixed in the
body. We know from Section 7-2 that every centrosymmetric body is equimomental
to a spherical body. So we refer to (3.16) as the dynamical equation of motion
for a spherical top. The reduction theorem tells us that from the solution of
the equations of motion for a spherical top, we get the solution for an axially
symmetric body simply by multiplying by the Euler factor (3.15a). Actually, it
won’t be worth the trouble for us to include the Euler motion explicitly,
because it affects only the rate of rotation about the symmetry axis. The
motion of the symmetry axis is the most prominent feature of rotational
motion, and is completely described by the reduced equations.

It should be noted that (3.16) and (3.17) are coupled equations, since the
direction of $\mathbf{r}$ depends on the attitude $R$. Also, to relate the
solution to the axially symmetric case, the direction of $\mathbf{r}$ must be
that of the symmetry axis

$$\mathbf{e} = \mathbf{e}_3 = R^\dagger\boldsymbol{\sigma}_3R,$$

so it will be unaffected by the Euler factor. To make this explicit and lump all
the constants together, let us write $\mathbf{r} = r\mathbf{e}$ and write
(3.16) in the form

$$\dot{\boldsymbol{\omega}} = \mathbf{e}\times\mathbf{G}, \tag{3.18}$$

where the effective force $\mathbf{G}$ is defined by

$$\mathbf{G} = \frac{r}{I}\mathbf{F}. \tag{3.19}$$

Now a significant mathematical advantage of the reduction theorem can be
seen. It enables us to combine the coupled equations (3.17) and (3.18) into a
single spinor equation of motion for the body. We get the equation by
differentiating (3.17) and using (3.18); thus,

$$\begin{aligned}
\ddot{R} &= \tfrac{1}{2}R(i\dot{\boldsymbol{\omega}} - \tfrac{1}{2}\omega^2) = \tfrac{1}{2}R(\mathbf{e}\wedge\mathbf{G} - \tfrac{1}{2}\omega^2) \\
&= \tfrac{1}{2}R(\mathbf{e}\mathbf{G} - \mathbf{e}\cdot\mathbf{G} - \tfrac{1}{2}\omega^2) \\
&= \tfrac{1}{2}\boldsymbol{\sigma}_3R\mathbf{G} - \tfrac{1}{2}R(\tfrac{1}{2}\omega^2 + \mathbf{e}\cdot\mathbf{G}).
\end{aligned} \tag{3.20}$$

To make this a determinate equation, we need to express the last term as a
definite function of $R$. This can be done in several different ways. We can
eliminate either $\omega^2$ or $\mathbf{G}\cdot\mathbf{e}$ in favor of the
other by noting that for constant $\mathbf{G}$ Euler’s Equation (3.18) admits
the effective energy

$$E = \tfrac{1}{2}\omega^2 - \mathbf{e}\cdot\mathbf{G} \tag{3.21}$$

as a constant of the motion. Either $\omega^2$ or $\mathbf{e}\cdot\mathbf{G}$
can be expressed in terms of $R$ by using

$$\omega^2 = 4\langle\dot{R}^\dagger\dot{R}\rangle_0, \tag{3.22}$$

or

$$\mathbf{e}\cdot\mathbf{G} = (R^\dagger\boldsymbol{\sigma}_3R)\cdot\mathbf{G} = \langle R^\dagger\boldsymbol{\sigma}_3R\mathbf{G}\rangle_0. \tag{3.23}$$

To keep these alternatives in mind, let us write the spinor equation of motion
for a spherical top in the standard form

$$\ddot{R} = \tfrac{1}{2}\boldsymbol{\sigma}_3R\mathbf{G} - \tfrac{1}{2}LR, \tag{3.24}$$

where $L$ can have the various forms

$$L = \tfrac{1}{2}\omega^2 + \mathbf{e}\cdot\mathbf{G} = E + 2\mathbf{e}\cdot\mathbf{G} = \omega^2 - E, \tag{3.25}$$

to be used in (3.24) along with (3.22) or (3.23).

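The spinor equation of motion (3.24) can be put in many alternative forms
by various parametrizations of the spinor $R$. For example, we can use the
Euler parameters introduced in Section 5-3 by writing

$$R = \alpha + i\boldsymbol{\beta}. \tag{3.26}$$

This has the advantage of explicitly exhibiting the direction
$\boldsymbol{\beta}$ of the instantaneous axis of rotation. Substituting into
the spinor equation (3.24) we obtain

$$\ddot{\alpha} + i\ddot{\boldsymbol{\beta}} = \tfrac{1}{2}(\alpha\boldsymbol{\sigma}_3\mathbf{G} + i\boldsymbol{\sigma}_3\boldsymbol{\beta}\mathbf{G}) - \tfrac{1}{2}L(\alpha + i\boldsymbol{\beta}).$$

Let us simplify this by choosing $\boldsymbol{\sigma}_3 = \hat{\mathbf{G}}$,
which we are free to do. Then separating scalar and bivector parts and using
$\mathbf{G}\boldsymbol{\beta} = \boldsymbol{\beta}\mathbf{G} + 2\mathbf{G}\wedge\boldsymbol{\beta}$
we obtain a pair of coupled equations of motion for the Euler parameters

$$\ddot{\alpha} + \tfrac{1}{2}(L - G)\alpha = 0, \tag{3.27a}$$

$$\ddot{\boldsymbol{\beta}} + \boldsymbol{\beta}\wedge\mathbf{G}\hat{\mathbf{G}} + \tfrac{1}{2}(L - G)\boldsymbol{\beta} = 0. \tag{3.27b}$$

The Euler parameters are related by $R^\dagger R = \alpha^2 + \beta^2 = 1$, but
the Equations (3.27a, b) look more complicated if $\alpha$ is eliminated in
favor of $\boldsymbol{\beta}$. As a function of the Euler parameters, the
direction of the symmetry axis is

$$\begin{aligned}
\mathbf{e} &= (\alpha - i\boldsymbol{\beta})\hat{\mathbf{G}}(\alpha + i\boldsymbol{\beta}) = \alpha^2\hat{\mathbf{G}} + 2\alpha\boldsymbol{\beta}\times\hat{\mathbf{G}} + \boldsymbol{\beta}\hat{\mathbf{G}}\boldsymbol{\beta} \\
&= \hat{\mathbf{G}} + 2\alpha\boldsymbol{\beta}\times\hat{\mathbf{G}} + 2\boldsymbol{\beta}\wedge\hat{\mathbf{G}}\boldsymbol{\beta}.
\end{aligned} \tag{3.28}$$

We shall find the Euler parameters useful for small angle approximations and
motion in a vertical plane, but different parametrizations are better in other
situations.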

The spinor equation of motion (3.24) would be interesting to study for a
variety of forces, but we shall use it only for the so-called Lagrange problem,
in which case $\mathbf{G}$ and $E$ are constants. To relate the equations to a
concrete problem, let us consider the object in Figure 3.3, sometimes called
a gyroscopic pendulum. The object is an axially symmetric body consisting of a
disk attached to one end of a rigid rod along its axis. The other end of the
rod is held fixed, but the object is free to spin about its axis and rotate in
any way about the fixed point. Neglecting friction, the entire torque about the
fixed point is due to the resultant gravitational force
$\mathbf{F} = m\mathbf{g}$ acting at the center of mass at a directance $r$
from the fixed point.

Fig. 3.3. The gyroscopic pendulum.

Now let us turn to the problem of finding solutions to the spinor equation of
motion. We begin with the simplest special cases to gain insight.


Small Oscillations of a Pendulum 


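When we write $\mathbf{e} = R^\dagger\hat{\mathbf{G}}R$, the spinor
$R = e^{(1/2)i\boldsymbol{\varepsilon}}$ measures the deviation of the symmetry
axis $\mathbf{e}$ from the downward vertical $\hat{\mathbf{G}}$. For small
deviations from the vertical, the angle $\boldsymbol{\varepsilon}$ is small, so
we can use the power series expansion of the exponential function to get the
approximate expression

$$R = e^{(1/2)i\boldsymbol{\varepsilon}} \approx 1 + \tfrac{1}{2}i\boldsymbol{\varepsilon}, \tag{3.29}$$

good to first order for $|\tfrac{1}{2}\boldsymbol{\varepsilon}| \ll 1$. Note
that the right side of (3.29) is an expression for $R$ in terms of Euler
parameters $\alpha = 1$ and
$\boldsymbol{\beta} = \tfrac{1}{2}\boldsymbol{\varepsilon}$, so we can insert
these values into (3.27a, b) to obtain $L = G$ and the first order equation for
$\boldsymbol{\varepsilon}$:

$$\ddot{\boldsymbol{\varepsilon}} + \boldsymbol{\varepsilon}\wedge\mathbf{G}\hat{\mathbf{G}} = 0. \tag{3.30}$$

To understand equation (3.30), notice that
$\boldsymbol{\varepsilon}\wedge\hat{\mathbf{G}}\hat{\mathbf{G}}$ is merely the
projection of $\boldsymbol{\varepsilon}$ onto the horizontal plane, so the
vertical component of $\boldsymbol{\varepsilon}$ satisfies
$\ddot{\boldsymbol{\varepsilon}}_\parallel = 0$. The solution of this equation
will be incompatible with the assumption that $\boldsymbol{\varepsilon}$ is
small unless $\dot{\boldsymbol{\varepsilon}}_\parallel = 0$, so we may assume
that $\boldsymbol{\varepsilon}_\parallel = 0$. Note, however, that if the object
were hanging vertically and spinning about its symmetry axis with constant
angular velocity $\boldsymbol{\omega} = \omega\hat{\mathbf{G}}$, then the
rotation angle would be $\varepsilon_\parallel = \omega t$. Therefore, the
assumption that the angle $\boldsymbol{\varepsilon}$ is small implies that the
object is not spinning about its axis, so its motion is that of a pendulum.

For horizontal $\boldsymbol{\varepsilon}$ we have
$\boldsymbol{\varepsilon}\cdot\mathbf{G} = 0$, so
$\boldsymbol{\varepsilon}\wedge\mathbf{G} = \boldsymbol{\varepsilon}\mathbf{G}$,
and (3.30) reduces to

$$\ddot{\boldsymbol{\varepsilon}} + G\boldsymbol{\varepsilon} = 0. \tag{3.31}$$

This will be recognized as the equation for a harmonic oscillator, so the
solution is an ellipse in the horizontal plane with the parametric form

$$\boldsymbol{\varepsilon} = \mathbf{a}\cos G^{1/2}t + \mathbf{b}\sin G^{1/2}t. \tag{3.32}$$

For horizontal $\boldsymbol{\varepsilon}$, the Equation (3.28) for the axis
reduces to

$$\mathbf{e} = \hat{\mathbf{G}} + \boldsymbol{\varepsilon}\times\hat{\mathbf{G}} + \tfrac{1}{2}\boldsymbol{\varepsilon}\hat{\mathbf{G}}\boldsymbol{\varepsilon}. \tag{3.33}$$

This gives us the value of $\mathbf{e}$ to second order from the first order
value of the angle $\boldsymbol{\varepsilon}$. The term
$\boldsymbol{\varepsilon}\times\hat{\mathbf{G}} = \boldsymbol{\varepsilon}(-i\hat{\mathbf{G}})$
differs from $\boldsymbol{\varepsilon}$ only in being rotated by $\pi/2$
in the horizontal plane. Therefore, along with (3.32) the first two terms in
(3.33) describe the orbit of the vector $\mathbf{e}$ to first order as an
ellipse in a horizontal plane with its center at $\hat{\mathbf{G}}$. The second
order term in (3.33), which equals $-\tfrac{1}{2}\varepsilon^2\hat{\mathbf{G}}$
for horizontal $\boldsymbol{\varepsilon}$, is directed entirely along the
vertical; it has the effect of bending the elliptical orbit on the plane to fit
it on the unit sphere and make it compatible with the condition
$\mathbf{e}^2 = 1$.

The solution (3.32) tells us that the motion is periodic with period $T$ given
by

$$T = \frac{2\pi}{G^{1/2}},$$

where the value of $G$ is taken from (3.19). According to the parallel axis
theorem, $I = I_0 + mr^2 = mr_0^2 + mr^2$, where $I_0$ is the moment of inertia
at the center of mass and $r_0$ is its radius of gyration. Therefore,

$$G = \frac{rmg}{I} = \frac{rg}{r_0^2 + r^2}, \tag{3.34}$$

$$T = 2\pi\left(\frac{r_0^2 + r^2}{rg}\right)^{1/2}. \tag{3.35}$$

For $r_0^2 \ll r^2$ this reduces to the formula for the period of a simple
pendulum, for which the mass of the bob is supposed to be concentrated at a
single point.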
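The small-oscillation period can be evaluated directly from $G = mgr/I$ with $I = m(r_0^2 + r^2)$, which gives $T = 2\pi[(r_0^2 + r^2)/(rg)]^{1/2}$. The following Python sketch compares this with the simple-pendulum formula $2\pi(r/g)^{1/2}$. The function names, the default value of $g$ and the sample lengths are assumptions made for illustration.

```python
import math

def compound_period(r, r0, g=9.81):
    """Small-oscillation period; r = pivot-to-CM distance, r0 = radius of gyration about the CM."""
    return 2.0 * math.pi * math.sqrt((r0**2 + r**2) / (r * g))

def simple_period(r, g=9.81):
    """Period of a simple pendulum of length r."""
    return 2.0 * math.pi * math.sqrt(r / g)

# Hypothetical pendulum: r = 1 m, with a point bob and with r0 = 0.1 m
t_point = compound_period(1.0, 0.0)
t_extended = compound_period(1.0, 0.1)

print(t_point, t_extended, t_extended / t_point)
```

For $r_0 = 0$ the two formulas agree, and a finite radius of gyration lengthens the period by the factor $(1 + r_0^2/r^2)^{1/2}$.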


Steady Precession 
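Now let us return to the study of the exact spinor equation

$$\ddot{R} = \tfrac{1}{2}\boldsymbol{\sigma}_3R\mathbf{G} - \tfrac{1}{2}LR,$$

with $L$ given by (3.25). The terms on the right must be cancelled by the
$\ddot{R}$ on the left, so $\boldsymbol{\sigma}_3$ and $\mathbf{G}$ must be
produced by differentiation. This suggests that we look for a solution of the
form

$$R = R_1R_2,$$

where

$$\dot{R}_1 = \tfrac{1}{2}i\boldsymbol{\omega}_1R_1 \quad\text{and}\quad \dot{R}_2 = \tfrac{1}{2}R_2i\boldsymbol{\omega}_2.$$

Then

$$\dot{R} = \tfrac{1}{2}i\boldsymbol{\omega}_1R + \tfrac{1}{2}Ri\boldsymbol{\omega}_2. \tag{3.36}$$

If a solution with constant $\boldsymbol{\omega}_1$ and $\boldsymbol{\omega}_2$
is possible, then

$$\ddot{R} = -\tfrac{1}{2}\boldsymbol{\omega}_1R\boldsymbol{\omega}_2 - \tfrac{1}{4}(\omega_1^2 + \omega_2^2)R = \tfrac{1}{2}\boldsymbol{\sigma}_3R\mathbf{G} - \tfrac{1}{2}LR.$$

From this we see that a solution is achieved if

$$\boldsymbol{\omega}_1 = \omega_1\boldsymbol{\sigma}_3, \qquad \boldsymbol{\omega}_2 = \omega_2\hat{\mathbf{G}}, \tag{3.37}$$

$$\omega_1\omega_2 = -G, \tag{3.38}$$

$$\tfrac{1}{2}(\omega_1^2 + \omega_2^2) = L, \tag{3.39}$$

where $L = E + 2\mathbf{G}\cdot\mathbf{e}$ must be a constant, so
$\mathbf{G}\cdot\mathbf{e}$ is constant. The last two equations can be solved
to get $\omega_1$ and $\omega_2$ in terms of $G$ and $L$, but a different
expression for $\omega_1$ and $\omega_2$ is more helpful.

Inserting (3.37) and (3.38) into (3.36), we obtain the total rotational
velocity $\boldsymbol{\omega}$ of the body

$$\boldsymbol{\omega} = \frac{2}{i}R^\dagger\dot{R} = R^\dagger\boldsymbol{\omega}_1R + \boldsymbol{\omega}_2 = \omega_1\mathbf{e} + \omega_2\hat{\mathbf{G}}, \tag{3.40}$$

where $\mathbf{e} = R^\dagger\boldsymbol{\sigma}_3R$ as before. The total
angular speed about the symmetry axis is

$$\boldsymbol{\omega}\cdot\mathbf{e} = \omega_1 + \omega_2\hat{\mathbf{G}}\cdot\mathbf{e}. \tag{3.41}$$

The right side of this equation shows that $\boldsymbol{\omega}\cdot\mathbf{e}$
is a constant of the motion, though we proved that from more general
considerations in Section 7-1. Now we solve (3.38) and (3.41) for $\omega_1$ and
$\omega_2$, and we find two pairs of solutions

$$\omega_1 = \tfrac{1}{2}\left(\boldsymbol{\omega}\cdot\mathbf{e} \mp [(\boldsymbol{\omega}\cdot\mathbf{e})^2 + 4\mathbf{G}\cdot\mathbf{e}]^{1/2}\right), \tag{3.42}$$

$$\omega_2 = \frac{\boldsymbol{\omega}\cdot\mathbf{e} \pm [(\boldsymbol{\omega}\cdot\mathbf{e})^2 + 4\mathbf{G}\cdot\mathbf{e}]^{1/2}}{2\hat{\mathbf{G}}\cdot\mathbf{e}}. \tag{3.43}$$

The reader may verify that, if the expression (3.40) for $\boldsymbol{\omega}$
is inserted into $L = \tfrac{1}{2}\omega^2 + \mathbf{e}\cdot\mathbf{G}$, then
(3.39) is obtained as an identity when (3.38) is used. Therefore (3.39) does
not give us any additional information about $\omega_1$ and $\omega_2$.
Having established that the conditions for a solution have been satisfied, we
can write the solution explicitly:

$$R = R_1R_2 = e^{(1/2)i\boldsymbol{\omega}_1t}\,R_0\,e^{(1/2)i\boldsymbol{\omega}_2t}, \tag{3.44}$$

and we can take $R_0 = 1$ if we set $\boldsymbol{\sigma}_3 = \mathbf{e}_0$, the
initial value for the symmetry axis. The solution (3.44) has exactly the same
mathematical form as the free precession solution (3.9). Both solutions
describe a body spinning with angular velocity $\boldsymbol{\omega}_1$ while it
precesses steadily with angular velocity $\boldsymbol{\omega}_2$. However,
their physical origins are totally different. The solution (3.44) describes a
forced steady precession, because it results from an applied torque.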

Of course, (3.44) is the reduced solution for an axially symmetric body, so
to get the full solution we must multiply it by the Euler factor. According to
(3.44), the angular speed of the Eulerian motion in this case is

$$\omega_0 = \left(\frac{I}{I_3} - 1\right)\boldsymbol{\omega}\cdot\mathbf{e}. \tag{3.45}$$

The full solution for steady forced precession of an axially symmetric body is
therefore

$$R = e^{(1/2)i\mathbf{e}_0(\omega_0 + \omega_1)t}\,e^{(1/2)i\boldsymbol{\omega}_2t}. \tag{3.46}$$

The effect of the Eulerian motion is to shift the component of
$\boldsymbol{\omega}$ along $\mathbf{e}$ to produce a new rotational velocity
$\boldsymbol{\omega}'$. The amount of the shift is given by

$$\boldsymbol{\omega}'\cdot\mathbf{e} = \omega_0 + \boldsymbol{\omega}\cdot\mathbf{e} = \frac{I}{I_3}\,\boldsymbol{\omega}\cdot\mathbf{e}. \tag{3.47}$$

This relation makes it easy to convert results for centrosymmetric bodies to
results for axially symmetric bodies. For example, using it in (3.43) we obtain

$$\omega_2 = \frac{I_3\boldsymbol{\omega}'\cdot\mathbf{e} \pm [I_3^2(\boldsymbol{\omega}'\cdot\mathbf{e})^2 + 4I^2\mathbf{G}\cdot\mathbf{e}]^{1/2}}{2I\hat{\mathbf{G}}\cdot\mathbf{e}}, \tag{3.48}$$

expressing the angular precession speed $\omega_2$ in terms of the actual
rotational velocity $\boldsymbol{\omega}'$ for an axially symmetric body. It
will be noted that the conversion does not alter the functional form of the
basic relations, so we might as well deal with the simpler relations in terms
of $\boldsymbol{\omega}$ and make the conversion only at the end of
calculations when numerical results are desired.

The expression (3.43) for the angular precession speed $\omega_2$ is the item
of greatest interest here, because it describes the motion of the symmetry
axis. Let’s see what it tells us about various special cases. For a rapidly
spinning body the kinetic energy is much greater than the potential energy.
Therefore, $(\boldsymbol{\omega}\cdot\mathbf{e})^2 \gg |4\mathbf{G}\cdot\mathbf{e}|$,
and we can expand the square root in (3.43) to get

$$\omega_2 = \frac{\boldsymbol{\omega}\cdot\mathbf{e}}{2\hat{\mathbf{G}}\cdot\mathbf{e}}\left\{1 \pm \left[1 + \frac{1}{2}\,\frac{4\mathbf{G}\cdot\mathbf{e}}{(\boldsymbol{\omega}\cdot\mathbf{e})^2} + \dots\right]\right\}. \tag{3.49}$$

Thus, to a good approximation we have the two solutions

$$\omega_2 \approx \frac{\boldsymbol{\omega}\cdot\mathbf{e}}{\hat{\mathbf{G}}\cdot\mathbf{e}} \tag{3.50a}$$

and

$$\omega_2 \approx -\frac{G}{\boldsymbol{\omega}\cdot\mathbf{e}}. \tag{3.50b}$$

The solution (3.50a) is said to describe a fast top, because the precession
speed is large. It describes an upright top if
$\hat{\mathbf{G}}\cdot\mathbf{e} < 0$ and a hanging top (or gyroscopic pendulum)
if $\hat{\mathbf{G}}\cdot\mathbf{e} > 0$. The two possibilities are shown in
Figure 3.4. The solution (3.50b) describes a slow top because the precession
speed is small. The negative sign shows that the precession is retrograde, that
is, in a sense opposite to the spin about the symmetry axis. The reciprocal
relation $\omega_1\omega_2 = -G$ tells us that the spin $\omega_1$ about the
symmetry axis will be small for a fast top and large for a slow top.
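The two steady-precession branches can be checked numerically. The Python sketch below solves $\omega_1\omega_2 = -G$ together with $\boldsymbol{\omega}\cdot\mathbf{e} = \omega_1 + \omega_2\hat{\mathbf{G}}\cdot\mathbf{e}$ using the closed forms (3.42) and (3.43), then compares the results with the fast-top and slow-top approximations. The function name and the sample values are assumptions; `cos_theta` stands for $\hat{\mathbf{G}}\cdot\mathbf{e}$.

```python
import math

def precession_speeds(spin, G, cos_theta):
    """Return the two (w1, w2) pairs of steady precession.

    spin = omega.e, G = |G|, cos_theta = G_hat.e (must be nonzero).
    """
    root = math.sqrt(spin**2 + 4.0 * G * cos_theta)
    pairs = []
    for sign in (+1.0, -1.0):
        w2 = (spin + sign * root) / (2.0 * cos_theta)   # (3.43)
        w1 = 0.5 * (spin - sign * root)                 # (3.42), opposite sign choice
        pairs.append((w1, w2))
    return pairs

# Hypothetical rapidly spinning hanging top
spin, G, cos_theta = 50.0, 2.0, 0.5
fast, slow = precession_speeds(spin, G, cos_theta)

fast_estimate = spin / cos_theta    # fast-top approximation (3.50a)
slow_estimate = -G / spin           # slow-top approximation (3.50b)
print(fast, fast_estimate)
print(slow, slow_estimate)
```

Both branches satisfy the two defining conditions exactly, and for a large spin the exact precession speeds lie close to the approximate ones.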


Fig. 3.4. Steady precession (panels: slow top, fast top).


It should be appreciated that this analysis merely shows that the states of
fast and slow steady precession are possible, without indicating the conditions
under which they can be achieved. In fact, fast precession is comparatively
difficult to achieve except under laboratory conditions, while slow precession
is a common phenomenon observed in the motion of a child’s top, of the Earth,
and of molecules.

The Sun and the Moon exert a torque on the Earth as a result of the Earth’s
oblateness, which is measured by the fractional difference $(I_3 - I)/I_3$ of
its moments of inertia. This produces a precession of the Earth’s axis about
the normal to the ecliptic (the plane of the Earth’s orbit about the Sun). The
intersections of the equator with the ecliptic are called equinoxes, so the
effect is observed on Earth as a precession of the equinoxes on the celestial
sphere, with different stars becoming the ‘pole star’ at different times.
However, the torque is so weak that the precessional period is about 25,000
years. We shall see how to calculate this value in Chapter 8. Figure 3.5 shows
the relation of the equinox precession to Eulerian precession.

Note that steady precession is possible only if
$(\boldsymbol{\omega}\cdot\mathbf{e})^2 + 4\mathbf{G}\cdot\mathbf{e} > 0$, since
otherwise the square root in (3.43) is not defined. This implies that a certain
minimum energy is needed for precession of a top in an upright position, since
then $\mathbf{G}\cdot\mathbf{e} < 0$. For an erect top,
$\hat{\mathbf{G}}\cdot\mathbf{e} = -1$ and the condition that it will remain
erect becomes strictly a condition on its kinetic energy
$\tfrac{1}{2}\omega^2 = \tfrac{1}{2}(\boldsymbol{\omega}\cdot\mathbf{e})^2 > 2G$.
When this condition is met, precession is indistinguishable from spin about
the symmetry axis, and the top is said to be sleeping.
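For an erect top the stability condition reads $(\boldsymbol{\omega}\cdot\mathbf{e})^2 > 4G$ with $G = mgr/I$, and the reduced spin is related to the actual spin by $\boldsymbol{\omega}\cdot\mathbf{e} = I_3\,\boldsymbol{\omega}'\cdot\mathbf{e}/I$. A minimal Python sketch of the resulting threshold on the actual spin follows. The function name and the toy-top parameters are hypothetical, and friction is ignored.

```python
import math

def min_sleeping_spin(m, r, I, I3, g=9.81):
    """Smallest actual spin (rad/s) about the symmetry axis for a sleeping top.

    m = mass, r = distance from support point to CM,
    I, I3 = transverse and axial moments of inertia about the support point.
    """
    G = m * g * r / I                     # magnitude of the effective force, G = rF/I
    reduced_spin = 2.0 * math.sqrt(G)     # threshold from (omega.e)^2 > 4G
    return reduced_spin * I / I3          # convert to omega'.e = (I / I3) omega.e

# Hypothetical toy top
m, r, I, I3 = 0.1, 0.03, 8e-5, 4e-5
spin_min = min_sleeping_spin(m, r, I, I3)
print(spin_min)
```

The same threshold can be written as $2(mgrI)^{1/2}/I_3$, so a top with a larger axial moment of inertia sleeps at a lower spin.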


Fig. 3.5. Gyroscopic motion of the Earth; Eulerian motion and precession of the equinoxes.
(Figure labels: geometric pole, ecliptic pole, celestial pole, equator; polar
rotation with period 1 day; Chandler wobble with period 436 days and radius
4 meters.)


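For $\boldsymbol{\omega}\cdot\mathbf{e} = 0$, the expression (3.43) reduces to

$$\omega_2 = \pm\left[\frac{G}{\hat{\mathbf{G}}\cdot\mathbf{e}}\right]^{1/2}, \tag{3.51}$$

and the motion is referred to as a conical pendulum. Since
$\boldsymbol{\omega}\cdot\mathbf{e} = I_3\,\boldsymbol{\omega}'\cdot\mathbf{e}/I$
for an axially symmetric body, we will get
$\boldsymbol{\omega}\cdot\mathbf{e} \approx 0$ when $I_3$ is relatively small.
In any case, the conical pendulum requires that the kinetic energy of rotation
about the symmetry axis be small.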


Deviations from Steady Precession 


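The solution for steady precession which we have just examined is an exact
solution of the equations of motion, but it is a special solution. Nevertheless,
for any total energy there is always a solution with steady precession. It differs
from other solutions with the same total energy in having the kinetic and
potential energy as separate constants of motion. Therefore, we can describe
any solution in terms of its deviation from steady precession. Accordingly, we
write the solution in the form

$$R = R_1UR_2, \tag{3.52}$$

where

$$R_1 = e^{(1/2)i\boldsymbol{\omega}_1t} \quad\text{with}\quad \boldsymbol{\omega}_1 = \omega_1\mathbf{e}_0, \tag{3.53a}$$

$$R_2 = e^{(1/2)i\boldsymbol{\omega}_2t} \quad\text{with}\quad \boldsymbol{\omega}_2 = \omega_2\hat{\mathbf{G}}, \tag{3.53b}$$

$$\omega_1\omega_2 = -G, \tag{3.53c}$$

$$\tfrac{1}{2}(\omega_1^2 + \omega_2^2) = E + 2\mathbf{G}\cdot\mathbf{e}_0. \tag{3.53d}$$

The spinor $U$ in (3.52) describes the deviation from steady precession. To
obtain a differential equation for $U$, we substitute (3.52) into the equation
of motion

$$\ddot{R} = \tfrac{1}{2}\left(\mathbf{e}_0R\mathbf{G} - (E + 2\mathbf{G}\cdot\mathbf{e})R\right),$$

and use (3.53a, b, c, d). Thus,

$$\begin{aligned}
\ddot{R} &= R_1\left[\ddot{U} + i\boldsymbol{\omega}_1\dot{U} + \dot{U}i\boldsymbol{\omega}_2 - \tfrac{1}{2}\boldsymbol{\omega}_1U\boldsymbol{\omega}_2 - \tfrac{1}{4}(\omega_1^2 + \omega_2^2)U\right]R_2 \\
&= \tfrac{1}{2}R_1\left[\mathbf{e}_0U\mathbf{G} - (E + 2\mathbf{G}\cdot\mathbf{e})U\right]R_2.
\end{aligned}$$

Hence,

$$\ddot{U} + i\boldsymbol{\omega}_1\dot{U} + \dot{U}i\boldsymbol{\omega}_2 + \mathbf{G}\cdot(\mathbf{e} - \mathbf{e}_0)U = 0. \tag{3.54}$$

Also, from (3.52),

$$\mathbf{e} = R^\dagger\mathbf{e}_0R = R_2^\dagger\mathbf{e}_1R_2, \tag{3.55a}$$

where

$$\mathbf{e}_1 = U^\dagger\mathbf{e}_0U. \tag{3.55b}$$

Hence, from (3.53b),

$$\mathbf{G}\cdot\mathbf{e} = \mathbf{G}\cdot\mathbf{e}_1 = \langle\mathbf{G}U^\dagger\mathbf{e}_0U\rangle_0. \tag{3.56}$$

This shows that the last term in (3.54) is a function of $U$ only.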


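To study small deviations from steady precession, let us introduce Euler
parameters by writing

$$U = \alpha + i\boldsymbol{\beta}. \tag{3.57}$$

Substituting this into (3.54) and separating bivector and scalar parts we obtain
the two equations

$$\ddot{\boldsymbol{\beta}} + \dot{\boldsymbol{\beta}}\times\boldsymbol{\omega}_- + \dot{\alpha}\boldsymbol{\omega}_+ + \mathbf{G}\cdot(\mathbf{e} - \mathbf{e}_0)\boldsymbol{\beta} = 0, \tag{3.58a}$$

$$\ddot{\alpha} - \dot{\boldsymbol{\beta}}\cdot\boldsymbol{\omega}_+ + \mathbf{G}\cdot(\mathbf{e} - \mathbf{e}_0)\alpha = 0, \tag{3.58b}$$

where

$$\boldsymbol{\omega}_\pm = \boldsymbol{\omega}_1 \pm \boldsymbol{\omega}_2. \tag{3.59}$$

From (3.55b) we obtain

$$\mathbf{e}_1 = U^\dagger\mathbf{e}_0U = \mathbf{e}_0 + 2\alpha\boldsymbol{\beta}\times\mathbf{e}_0 + 2\boldsymbol{\beta}\wedge\mathbf{e}_0\boldsymbol{\beta}, \tag{3.60}$$

so from (3.56) we get

$$\mathbf{G}\cdot(\mathbf{e} - \mathbf{e}_0) = 2\alpha\boldsymbol{\beta}\cdot(\mathbf{e}_0\times\mathbf{G}) + 2(\mathbf{e}_0\times\boldsymbol{\beta})\cdot(\boldsymbol{\beta}\times\mathbf{G}). \tag{3.61}$$

This is to be used in (3.58a, b) to express the equations as functions of
$\alpha$ and $\boldsymbol{\beta}$.

For the small angle approximation we have

$$U = e^{(1/2)i\boldsymbol{\varepsilon}} \approx 1 + \tfrac{1}{2}i\boldsymbol{\varepsilon}. \tag{3.62}$$

So using $\alpha = 1$ and
$\boldsymbol{\beta} = \tfrac{1}{2}\boldsymbol{\varepsilon}$ in (3.58a, b) and
(3.61), we obtain, to first order in $\boldsymbol{\varepsilon}$, the equations

$$\ddot{\boldsymbol{\varepsilon}} + \dot{\boldsymbol{\varepsilon}}\times\boldsymbol{\omega}_- = 0, \tag{3.63a}$$

$$-\dot{\boldsymbol{\varepsilon}}\cdot\boldsymbol{\omega}_+ + 2\boldsymbol{\varepsilon}\cdot(\mathbf{e}_0\times\mathbf{G}) = 0. \tag{3.63b}$$

Equation (3.63a) integrates immediately to

$$\dot{\boldsymbol{\varepsilon}} = \boldsymbol{\omega}_-\times\boldsymbol{\varepsilon}, \tag{3.64}$$

where the integration constant has been set to zero so that (3.63b) is satisfied.
From (3.59) and (3.53a, b, c) we have

$$\boldsymbol{\omega}_-\times\boldsymbol{\omega}_+ = 2\boldsymbol{\omega}_1\times\boldsymbol{\omega}_2 = -2\mathbf{e}_0\times\mathbf{G}, \tag{3.65}$$

which enables us to show that (3.63b) follows from (3.64).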
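The identity $\boldsymbol{\omega}_-\times\boldsymbol{\omega}_+ = -2\mathbf{e}_0\times\mathbf{G}$, with $\boldsymbol{\omega}_\pm = \omega_1\mathbf{e}_0 \pm \omega_2\hat{\mathbf{G}}$ and $\omega_1\omega_2 = -G$, can be confirmed with ordinary cross products, along with the claim that $\dot{\boldsymbol{\varepsilon}} = \boldsymbol{\omega}_-\times\boldsymbol{\varepsilon}$ makes the left side of (3.63b) vanish. The Python sketch below uses plain lists; the helper names and sample vectors are assumptions for illustration.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scale(s, a):
    return [s * x for x in a]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

# Hypothetical data: unit vectors e0 and G_hat, magnitude G, spin w1
e0 = [0.0, 0.6, 0.8]
G_hat = [0.0, 0.0, 1.0]
G = 3.0
w1 = 5.0
w2 = -G / w1                                  # steady-precession condition w1*w2 = -G

omega1 = scale(w1, e0)
omega2 = scale(w2, G_hat)
omega_plus = add(omega1, omega2)
omega_minus = add(omega1, scale(-1.0, omega2))

lhs = cross(omega_minus, omega_plus)
rhs = scale(-2.0 * G, cross(e0, G_hat))       # -2 e0 x G, with G = G * G_hat

eps = [0.01, -0.02, 0.005]                    # any small vector
eps_dot = cross(omega_minus, eps)             # rotating-vector law (3.64)
residual = -dot(eps_dot, omega_plus) + 2.0 * G * dot(eps, cross(e0, G_hat))
print(lhs, rhs, residual)
```

The residual vanishes for any choice of `eps`, which is the statement that (3.63b) adds no new condition once (3.64) holds.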
The solution to (3.64) is the rotating vector

$$\boldsymbol{\varepsilon} = e^{-(1/2)i\boldsymbol{\omega}_-t}\,\boldsymbol{\varepsilon}_0\,e^{(1/2)i\boldsymbol{\omega}_-t} = \boldsymbol{\varepsilon}_0\,e^{i\boldsymbol{\omega}_-t}, \tag{3.66}$$

where $\boldsymbol{\varepsilon}_0$ is a constant vector orthogonal to
$\boldsymbol{\omega}_-$. An additive constant vector parallel to
$\boldsymbol{\omega}_-$ has been omitted from (3.66), because its only effect
would be to change initial conditions which are already taken care of in
specifying the vector $\mathbf{e}_0$.

We now have a complete solution to the equation of motion, and we can
exhibit it by writing the attitude spinor $R$ in the form

$$R = e^{(1/2)i\boldsymbol{\omega}_1t}\,e^{(1/2)i\boldsymbol{\varepsilon}}\,e^{(1/2)i\boldsymbol{\omega}_2t}, \tag{3.67}$$

where $\boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}(t)$ is the rotating
vector given by (3.66). This shows explicitly the time dependence of the
rotational motion and its decomposition into three simpler motions. As noted
before, the first factor in (3.67) describes a rotation of the body about its
symmetry axis, while the third factor describes a steady precession. The second
factor describes an oscillation about steady precession called nutation. To
visualize the motion, we consider the orbit $\mathbf{e} = \mathbf{e}(t)$ of the
symmetry axis on the unit sphere.

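To first order, substitution of (3.66) into (3.60) gives us

$$\mathbf{e}_1 = \mathbf{e}_0 + \boldsymbol{\varepsilon}\times\mathbf{e}_0 = \mathbf{e}_0 + \left(e^{-(1/2)i\boldsymbol{\omega}_-t}\,\boldsymbol{\varepsilon}_0\,e^{(1/2)i\boldsymbol{\omega}_-t}\right)\times\mathbf{e}_0. \tag{3.68}$$

Note that the term $\boldsymbol{\varepsilon}\times\mathbf{e}_0$ is a linear
function which projects $\boldsymbol{\varepsilon}$ onto a plane with normal
$\mathbf{e}_0$ and rotates it through a right angle. Therefore, it projects the
circle $\boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}(t)$ into an ellipse
with its major axis in the plane containing $\mathbf{e}_0$ and the normal to the
circle $\boldsymbol{\omega}_- = \omega_1\mathbf{e}_0 - \omega_2\hat{\mathbf{G}}$.
Thus, (3.68) describes an ellipse $\mathbf{e}_1 = \mathbf{e}_1(t)$ centered at
$\mathbf{e}_0$ and lying in the tangent plane to the unit sphere, as shown in
Figure 3.6. The eccentricity of the ellipse depends on the angle between
$\mathbf{e}_0$ and
$\boldsymbol{\omega}_- = \omega_1\mathbf{e}_0 - \omega_2\hat{\mathbf{G}}$. For a
slow top $\omega_1 \gg \omega_2$, so
$\boldsymbol{\omega}_- \approx \omega_1\mathbf{e}_0$ and the ellipse is nearly
circular.

Fig. 3.6. First order orbit of the symmetry axis.

The orbit $\mathbf{e} = \mathbf{e}(t)$ of the symmetry axis is the composite of
the elliptical motion (3.68) and the steady precession, as described by

$$\mathbf{e}(t) = R_2^\dagger\mathbf{e}_1R_2 = e^{-(1/2)i\boldsymbol{\omega}_2t}\,\mathbf{e}_1(t)\,e^{(1/2)i\boldsymbol{\omega}_2t}. \tag{3.69}$$

The resulting curve oscillates with angular speed

$$|\boldsymbol{\omega}_-| = [2(E + 3\mathbf{e}_0\cdot\mathbf{G})]^{1/2} \tag{3.70}$$

between two circles on the unit sphere with angular separation
$2\varepsilon = 2|\boldsymbol{\varepsilon}|$, as indicated in Figure 3.7. We use
the term nutation to designate the elliptical oscillation about steady
precession, though the term ordinarily refers only to the vertical ‘nodding’
component of this oscillation.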
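The nutation speed can be cross-checked in two ways: directly from $|\boldsymbol{\omega}_-|^2 = \omega_1^2 + \omega_2^2 - 2\omega_1\omega_2\,\mathbf{e}_0\cdot\hat{\mathbf{G}}$, and from the energy form $|\boldsymbol{\omega}_-|^2 = 2(E + 3\,\mathbf{e}_0\cdot\mathbf{G})$ that follows when $\omega_1\omega_2 = -G$ and $\tfrac{1}{2}(\omega_1^2 + \omega_2^2) = E + 2\mathbf{G}\cdot\mathbf{e}_0$ are used. The Python sketch below compares the two; the function names and sample values are assumptions, and `c` stands for $\mathbf{e}_0\cdot\hat{\mathbf{G}}$.

```python
import math

def nutation_speed_from_rates(w1, w2, c):
    """|omega_-| for omega_- = w1*e0 - w2*G_hat, with c = e0 . G_hat."""
    return math.sqrt(w1**2 + w2**2 - 2.0 * w1 * w2 * c)

def nutation_speed_from_energy(E, G, c):
    """|omega_-| expressed through the effective energy E and e0 . G = G*c."""
    return math.sqrt(2.0 * (E + 3.0 * G * c))

# Hypothetical slow top
G, c, w1 = 2.0, 0.5, 50.0
w2 = -G / w1                                  # steady-precession condition
E = 0.5 * (w1**2 + w2**2) - 2.0 * G * c       # effective energy of the steady solution

speed_a = nutation_speed_from_rates(w1, w2, c)
speed_b = nutation_speed_from_energy(E, G, c)
print(speed_a, speed_b)
```

For a slow top both values are close to the spin $\omega_1$, so the symmetry axis nutates at nearly the spin rate.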

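To determine the qualitative features of the orbit
$\mathbf{e} = \mathbf{e}(t)$, we look at the velocity

$$\dot{\mathbf{e}} = R_2^\dagger\left[(\boldsymbol{\omega}_-\times\boldsymbol{\varepsilon} + \boldsymbol{\omega}_2)\times\mathbf{e}_0\right]R_2. \tag{3.71}$$

The nutation velocity
$(\boldsymbol{\omega}_-\times\boldsymbol{\varepsilon})\times\mathbf{e}_0$ is
exactly opposite to the precession velocity
$\boldsymbol{\omega}_2\times\mathbf{e}_0$ only when the orbit is tangent to the
upper bounding circle in Figure 3.7. Therefore, $|\dot{\mathbf{e}}|$ has its
minimum values at such points, and $\dot{\mathbf{e}} = 0$ only on an orbit for
which

$$\varepsilon|\boldsymbol{\omega}_1 - \boldsymbol{\omega}_2| = |\mathbf{e}_0\times\boldsymbol{\omega}_2|. \tag{3.72}$$

This is the condition for the cuspidal orbit in Figure 3.7. A looping orbit
occurs when
$\varepsilon|\boldsymbol{\omega}_1 - \boldsymbol{\omega}_2| > |\mathbf{e}_0\times\boldsymbol{\omega}_2|$,
and a smooth orbit without loops occurs when
$\varepsilon|\boldsymbol{\omega}_1 - \boldsymbol{\omega}_2| < |\mathbf{e}_0\times\boldsymbol{\omega}_2|$.
A cuspidal orbit can be achieved in practice by releasing the axis of a spinning
top from an initial position at rest. Therefore, the two other kinds of orbits
can be achieved with an initial impetus following or opposing the direction of
precessional motion.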

Fig. 3.7. Nutation of the symmetry axis of a precessing top.

Equation (3.67) is an approximate solution of Euler’s equation for the
Lagrange problem with the same accuracy as the harmonic approximation for
the motion of a pendulum. In Section 7-4 we will find the exact solution in
terms of elliptic functions. Unfortunately, the exact solution is difficult to
interpret and awkward to use. However, all its qualitative features are
already displayed in a much more convenient form by the approximate solution
(3.67). Moreover, in many applications the exact solution has little advantage,
because of uncertainties about perturbing forces such as friction. As a rule,
therefore, we expect the approximate solution to be more valuable than the
exact solution.


Effects of Friction 


Since friction is a contact force, the effect of friction on a spinning body
depends on the distribution of frictional forces over the surface of the body.
For a symmetric top spinning about its symmetry axis with its CM at rest, the
forces of air friction are symmetrical about the axis. Consider the frictional
forces $\mathbf f$ and $\mathbf f'$ at two symmetrically related points $\mathbf r$ and $\mathbf r'$ as shown
in Figure 3.8. The frictional force is opposite to the velocity of the surface at
the point of contact and $\mathbf f = -\mathbf f'$ by symmetry. Therefore, the two forces make
up a couple with torque

$$\mathbf r\times\mathbf f + \mathbf r'\times\mathbf f' = (\mathbf r - \mathbf r')\times\mathbf f \;\propto\; -\boldsymbol\omega_\parallel = -(\boldsymbol\omega\cdot\mathbf e)\,\mathbf e.$$

It is most important to note that the direction of the torque is opposite to that
of the angular velocity. Also, if the frictional force is proportional to the
velocity at the point of contact, then it is also proportional to
$\boldsymbol\omega_\parallel = (\boldsymbol\omega\cdot\mathbf e)\,\mathbf e$. The same conclusions hold for all other pairs of
symmetrically placed points. Therefore, the resultant torque due to air friction
has the form

$$\boldsymbol\Gamma = -\lambda\,(\boldsymbol\omega\cdot\mathbf e)\,\mathbf e, \tag{3.73}$$

where $\lambda$ is a positive scalar depending on the shape of the body, the viscosity
and density of the air and, to some extent, on $\boldsymbol\omega\cdot\mathbf e$ for reasons discussed in
Section 3-5. From Euler’s equation $\dot{\mathbf l} = I\dot{\boldsymbol\omega} = \boldsymbol\Gamma$, we obtain

$$I\,\frac{d}{dt}(\mathbf e\cdot\boldsymbol\omega) = \mathbf e\cdot\boldsymbol\Gamma = -\lambda\,\mathbf e\cdot\boldsymbol\omega.$$

For a linear resistive force, $\lambda$ is constant, so

$$\mathbf e\cdot\boldsymbol\omega = \mathbf e_0\cdot\boldsymbol\omega_0\; e^{-\lambda t/I}. \tag{3.74}$$

Thus we have exponential decay of the spin $\boldsymbol\omega\cdot\mathbf e$.

Fig. 3.8. Air friction on a spinning body.
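As a quick numerical illustration of (3.74), the sketch below integrates the scalar spin equation $I\,ds/dt = -\lambda s$ (with $s = \mathbf e\cdot\boldsymbol\omega$) by a classical Runge-Kutta step and compares the result with the exponential law. The function name `spin_decay` and all parameter values are illustrative assumptions, not taken from the text.

```python
import math

def spin_decay(s0, lam, inertia, t_end, steps=1000):
    """Integrate I ds/dt = -lam * s with classical RK4 and return s(t_end)."""
    h = t_end / steps
    rate = -lam / inertia
    s = s0
    for _ in range(steps):
        k1 = rate * s
        k2 = rate * (s + 0.5 * h * k1)
        k3 = rate * (s + 0.5 * h * k2)
        k4 = rate * (s + h * k3)
        s += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return s

# Hypothetical values: initial spin, friction coefficient, moment of inertia, duration.
s0, lam, inertia, t_end = 100.0, 0.02, 0.5, 30.0
numeric = spin_decay(s0, lam, inertia, t_end)
analytic = s0 * math.exp(-lam * t_end / inertia)   # equation (3.74)
print(numeric, analytic)
```

The two printed numbers should agree to many digits, which is all the check is meant to show; a velocity-dependent $\lambda$ would need the same loop with `rate` recomputed at each stage.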

For a slow top the effect of air resistance due to its precession will be small
compared to the effect due to the spin about its symmetry axis. Therefore, the 
main effect of air resistance on a slow top will be simply to slow down its spin 
at a roughly exponential rate. Thus, for a sleeping top, the spin w-e is reduced 
by air friction as well as by friction at the point of support until the condition 
for stability no longer holds. Then it begins to fall so nutation and precession 
set in. As the spin continues to decrease, the amplitude of the nutation 
increases until it is so large that the top falls over. 

Fig. 3.9. The rising top.

When a rapidly spinning top is placed on a rough surface, as indicated in
Figure 3.9, a force $\mathbf f$ of sliding friction is exerted at the point of contact.
The torque exerted by the frictional force can be separated into two parts:

$$\mathbf r\times\mathbf f = \mathbf r_\parallel\times\mathbf f + \mathbf r_\perp\times\mathbf f, \tag{3.75}$$

where $\mathbf r_\parallel = (\mathbf r\cdot\mathbf e)\,\mathbf e$ and $\mathbf r_\perp = (\mathbf r\wedge\mathbf e)\,\mathbf e$.
The torque $\mathbf r_\perp\times\mathbf f$ simply reduces the spin $\boldsymbol\omega\cdot\mathbf e$ in the manner that
has just been described, and it will be comparatively small for small $|\mathbf r_\perp|$.
The torque $\mathbf r_\parallel\times\mathbf f$ has the form $\mathbf e\times\mathbf G$
which we have already studied, so we know that it will produce precession
about $-\mathbf f$. Since $\mathbf f$ lies in a horizontal plane as shown in Figure 3.9, this torque
will make the symmetry axis precess toward the vertical, and we speak of a
rising top. As the top rises, its rotational speed $|\boldsymbol\omega|$ decreases, since kinetic
energy is converted to potential energy. Once the top is erect it becomes a 
sleeping top, since r, X f vanishes. On the other hand, if the slipping ceases 
before the top is erect, rolling motion sets in. 


Magnetic Spin Resonance 


The spinor formulation of rotational dynamics has important applications to
atomic physics, since atoms, electrons and nuclei have intrinsic angular
momenta and magnetic moments. Consider an atom (electron or nucleus)
with intrinsic angular momentum $\mathbf l$ and magnetic moment $\boldsymbol\mu$. According to
electromagnetic theory, a magnetic field $\mathbf B$ will exert a torque $\boldsymbol\mu\times\mathbf B$, so the
rotational equation of motion for the atom is

$$\dot{\mathbf l} = \boldsymbol\mu\times\mathbf B. \tag{3.76}$$

Atomic theory asserts that $\mathbf l$ and $\boldsymbol\mu$ are related by the “constitutive equation”

$$\boldsymbol\mu = \gamma\,\mathbf l, \tag{3.77}$$

where $\gamma$ is a scalar constant called the gyromagnetic ratio. Consequently, the
equation of motion can be written

$$\dot{\mathbf l} = (-\gamma\mathbf B)\times\mathbf l. \tag{3.78}$$

This implies that $\mathbf l^2$ is a constant of motion, so the effect of $\mathbf B$ is to produce a
time dependent rotation of $\mathbf l$. We know that such a rotation is most efficiently
represented by the equation

$$\mathbf l = U^\dagger\,\mathbf l_0\,U, \tag{3.79}$$

where $\mathbf l_0$ is the initial value of $\mathbf l$. Accordingly, we can replace (3.78) by the
spinor equation of motion

$$\dot U = \tfrac12\,U\,i(-\gamma\mathbf B) \tag{3.80}$$

subject to the initial condition $U(0) = 1$.

The spinor U in (3.80) should not be confused with the attitude spinor for a 
rigid body. According to current atomic theory, the attitude of an atom is not 
observable, so there is no attitude variable in the theory. The angular 
momentum and energy of an atom are observable, so there is a dynamical 
equation for rotational motion in atomic theory. But, in contrast with the 
macroscopic mechanics of rotating bodies, there is no kinematical equation 
for attitude. 

Experimentalists wish to manipulate the magnetic moment $\boldsymbol\mu$ by applying
suitable magnetic fields. To see how this can be done, we study the solution of
(3.80) for particular applied fields. The dynamical spinor equation (3.80) is
preferred over the vector equation (3.78), because it is easier to solve. For a
static field $\mathbf B = \mathbf B_0$, the solution of (3.80) is simply

$$U = e^{-\frac12 i\gamma\mathbf B_0 t}. \tag{3.81}$$

This tells us that $\mathbf l$ and $\boldsymbol\mu$ precess about the magnetic field with an angular
frequency $-\gamma\mathbf B_0$.

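Now suppose we introduce a circularly polarized monochromatic plane
wave propagating along the direction of the established static magnetic field
$\mathbf B_0$. At the site of the atom, the magnetic field of such a wave is a rotating
vector

$$\mathbf b(t) = \mathbf b_0\,e^{i\boldsymbol\omega t}, \tag{3.82}$$

where $\mathbf b_0$ and $\boldsymbol\omega$ are constant vectors for which $\boldsymbol\omega\mathbf B_0 = \mathbf B_0\boldsymbol\omega$ and $\mathbf b_0\boldsymbol\omega = -\boldsymbol\omega\mathbf b_0$.
The resultant magnetic field acting on the atom is therefore

$$\mathbf B = \mathbf B_0 + \mathbf b_0\,e^{i\boldsymbol\omega t} = S^\dagger(\mathbf B_0 + \mathbf b_0)\,S, \tag{3.83}$$

where

$$S = e^{\frac12 i\boldsymbol\omega t}. \tag{3.84}$$

The form of (3.83) suggests that we should write $U$ in the factored form

$$U = RS. \tag{3.85}$$

Then, the spinor equation of motion gives us

$$\dot U = -\tfrac12\,U i\gamma\mathbf B = -\tfrac12\,R\,i\gamma(\mathbf B_0 + \mathbf b_0)\,S = (\dot R + \tfrac12 R\,i\boldsymbol\omega)\,S.$$

Hence $R$ obeys the equation

$$\dot R = -\tfrac12\,R\,i\gamma\,(\mathbf B_0 + \gamma^{-1}\boldsymbol\omega + \mathbf b_0). \tag{3.86}$$

This has the solution

$$R = e^{-\frac12 i\gamma\mathbf B' t}, \tag{3.87}$$

where

$$\mathbf B' = \mathbf B_0 + \gamma^{-1}\boldsymbol\omega + \mathbf b_0. \tag{3.88}$$

The motion of $\mathbf l$ is therefore completely described by the spinor

$$U = e^{-\frac12 i\gamma\mathbf B' t}\,e^{\frac12 i\boldsymbol\omega t}. \tag{3.89}$$

This tells us that the motion is a composite of two precessions with constant
angular velocities.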

The experimentalist can tune the frequency of the electromagnetic wave
until the condition $\boldsymbol\omega = -\gamma\mathbf B_0$ for magnetic resonance is achieved. Under this
condition, (3.88) gives $\mathbf B' = \mathbf b_0$, which is orthogonal to $\boldsymbol\omega$ and $\mathbf B_0$. Then
(3.89) tells us that $\boldsymbol\mu = \gamma\mathbf l$ is precessing with angular velocity $-\gamma\mathbf b_0$ orthogonal
to $\mathbf B_0$ in a frame which is precessing about $\mathbf B_0$ with angular speed $\omega$. Since
$\omega = \gamma B_0 \gg \omega' = \gamma b_0$ typically (in magnitude), the composite motion will be a
steady spiral motion, as illustrated in Figure 3.10. If $\mathbf l$ is initially aligned
with $\mathbf B_0$ when the electromagnetic radiation is turned on, then its direction
will be reversed after half a turn about $\mathbf b_0$, that is, in a time $T = \pi/\gamma b_0$.
Consequently, a single “spin flip” can be produced by a pulse of
duration $T$ at resonance.
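The spin flip can be checked numerically without any spinor machinery. The sketch below integrates the vector equation (3.78) for a static field along z plus a small field rotating about z, once at resonance and once slightly detuned. It assumes illustrative values for $\gamma$, $B_0$ and $b_0$ and uses hypothetical helper names (`l_dot`, `evolve`); none of these come from the text.

```python
import math

def cross(a, b):
    """Vector cross product a x b for 3-tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def l_dot(t, l, gamma, B0, b0, omega):
    """Equation (3.78): dl/dt = (-gamma B) x l, with B = B0 z + rotating b(t).

    omega is the signed angular velocity of the rotating field about z.
    """
    B = (b0*math.cos(omega*t), b0*math.sin(omega*t), B0)
    return cross((-gamma*B[0], -gamma*B[1], -gamma*B[2]), l)

def evolve(l, gamma, B0, b0, omega, t_end, steps=20000):
    """Integrate (3.78) from t = 0 to t_end with classical RK4."""
    h = t_end/steps
    t = 0.0
    for _ in range(steps):
        k1 = l_dot(t, l, gamma, B0, b0, omega)
        k2 = l_dot(t + h/2, tuple(x + h/2*k for x, k in zip(l, k1)), gamma, B0, b0, omega)
        k3 = l_dot(t + h/2, tuple(x + h/2*k for x, k in zip(l, k2)), gamma, B0, b0, omega)
        k4 = l_dot(t + h, tuple(x + h*k for x, k in zip(l, k3)), gamma, B0, b0, omega)
        l = tuple(x + h*(p + 2*q + 2*r + s)/6
                  for x, p, q, r, s in zip(l, k1, k2, k3, k4))
        t += h
    return l

gamma, B0, b0 = 1.0, 10.0, 0.5          # illustrative values only
T_flip = math.pi/(gamma*b0)             # half a turn about b0
l_res = evolve((0.0, 0.0, 1.0), gamma, B0, b0, -gamma*B0, T_flip)       # at resonance
l_off = evolve((0.0, 0.0, 1.0), gamma, B0, b0, -0.9*gamma*B0, T_flip)   # detuned by 10%
print(l_res[2], l_off[2])
```

At resonance the z-component of $\mathbf l$ ends very close to $-1$, whereas the detuned run never tips far from the static field, which is why the tuning condition matters in practice.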

Since the gyromagnetic ratio $\gamma$ has different values for different types of
atoms, the radiation field can be tuned to detect specific atomic types by
magnetic resonance even when they are buried in complex biological
materials. For this reason, magnetic resonance is widely used for chemical
analysis.

Fig. 3.10. Simultaneous precession of the magnetic moment $\boldsymbol\mu$ about the fields
$\mathbf B'$ and $\mathbf B_0$ at resonance.


7-3. Exercises


(3.1) Let $\theta$ be the angle between the angular momentum $\mathbf l$ and the
symmetry axis of a freely spinning axially symmetric satellite. Prove
that


where $E = \tfrac12\,\mathbf l\cdot\boldsymbol\omega$ is the kinetic energy.

The free Eulerian wobble of a satellite produces periodic internal
stresses leading to energy dissipation without significantly altering
the angular momentum. Show, therefore, that the wobble tends to
damp out if the satellite is oblate.

(3.2) A heavy homogeneous right circular cone spins with its vertex fixed.
The axis of the cone is 10 cm long, and the radius of its base is 5 cm.
The cone precesses steadily with a period of 4 sec. How many
revolutions per second does the cone make about its own axis?

(3.3) A homogeneous circular disk of radius $r$ spins on a smooth table
about a vertical diameter. Prove that the motion is stable if the
angular speed exceeds $2\sqrt{g/r}$.

(3.4) When $\boldsymbol\omega\cdot\mathbf e = 0$, Equation (3.69) describes the orbit of a so-called
spherical pendulum. Sketch the orbit for appropriate choices of the
free parameters. Show that the orbit of a spherical pendulum cannot
have loops or cusps such as those in Figure 3.5.

(3.5) A spinning top is held fixed in an upright position and suddenly
released. Describe the motion if the condition for stability of a
sleeping top is not satisfied.
(3.6) The tippie-top is a child’s toy consisting of a decapitated ball with a
short stem as shown in Figure 3.11. If the tippie-top is set on a table, 
stem upward, with sufficient spin about its stem, it will do a com- 
plete flip-flop ending up in the steady precession standing on its 
stem. Similarly, if a hard boiled egg lying
on a table is given sufficient spin about
a vertical axis, it rises to steady precession 
standing on its narrow end. Provide a 
qualitative explanation for such behavior. 
Assuming that the ball is hollow, show 
that the “flip time” for the tippie-top is 
approximately 27rqw/3ug. (The behavior 
of the tippie-top has been discussed in the 
American Journal of Physics on several 
occasions). 


Fig. 3.11. The tippie-top. 


7-4. Integrable Cases of Rotational Motion 


When the general solution to a system of differential equations can be 
expressed in closed form, we say that the system is integrable. A solution in 
closed form is expressed in terms of a finite number of known functions (in 
contrast to an infinite series). Such a solution is sometimes said to be exact, as 
opposed to an approximate solution obtained by truncating an infinite series. 

The rotational equations of motion are integrable in only a few simple 
cases, in particular, the plane pendulum, the symmetric top in a gravitational 
field, and a freely spinning asymmetric body. The general solutions in all 
these cases are expressible in terms of elliptic functions, and we shall find 
them in this section. 


Constants of Motion for the Lagrange Problem 


The problem of integrating the equations of motion for a symmetric top
subject to a gravitational torque is called the Lagrange problem. In Section
7-3 we reduced the equations of motion for a symmetric top to those for a
spherical top. So we begin here with the reduced equations

$$\dot R = \tfrac12\,R\,i\boldsymbol\omega, \tag{4.1}$$

$$\dot{\boldsymbol\omega} = \mathbf e\times\mathbf G, \tag{4.2}$$

where

$$\mathbf e = R^\dagger\sigma_3 R \tag{4.3}$$

and

$$\mathbf G = \frac{r\,\mathbf F}{I} \tag{4.4}$$

for a constant force (e.g., $\mathbf F = m\mathbf g$) applied at the point $r\mathbf e$.

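In Section 7-3 we combined (4.1) and (4.2) into a single second order
differential equation, but to find the exact solution it is more efficient to
exploit the constants of motion directly. As we have noted before, equation
(4.2) admits three integrals of motion, namely,

$$\mathbf e\cdot\boldsymbol\omega = -2\langle i\sigma_3\dot R R^\dagger\rangle_0, \tag{4.5}$$

$$\mathbf G\cdot\boldsymbol\omega = -2\langle i\mathbf G R^\dagger\dot R\rangle_0, \tag{4.6}$$

and the energy integral

$$E = \tfrac12\boldsymbol\omega^2 - \mathbf e\cdot\mathbf G = 2|\dot R|^2 - \langle R^\dagger\sigma_3 R\,\mathbf G\rangle_0. \tag{4.7}$$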


By finding these three integrals of the motion, we have effectively integrated 
the dynamical equation (4.2), leaving us with three first order scalar equations 
to be integrated for R. 
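That the three quantities in (4.5), (4.6) and (4.7) really are constant can be checked numerically in vector form. The sketch below assumes the kinematic relation $\dot{\mathbf e} = \boldsymbol\omega\times\mathbf e$ (which follows from (4.1) and (4.3)) together with (4.2), integrates both with a classical Runge-Kutta step, and monitors $\mathbf e\cdot\boldsymbol\omega$, $\mathbf G\cdot\boldsymbol\omega$ and $\tfrac12\boldsymbol\omega^2 - \mathbf e\cdot\mathbf G$. Initial values and helper names are illustrative assumptions.

```python
import math

def cross(a, b):
    """Vector cross product a x b for 3-tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def rhs(state, G):
    """Reduced top equations: de/dt = w x e and dw/dt = e x G, cf. (4.2)."""
    e, w = state[:3], state[3:]
    return cross(w, e) + cross(e, G)      # tuple concatenation gives a 6-tuple

def rk4_step(state, G, h):
    def shifted(k, c):
        return tuple(s + c*h*ki for s, ki in zip(state, k))
    k1 = rhs(state, G)
    k2 = rhs(shifted(k1, 0.5), G)
    k3 = rhs(shifted(k2, 0.5), G)
    k4 = rhs(shifted(k3, 1.0), G)
    return tuple(s + h*(p + 2*q + 2*r + t)/6
                 for s, p, q, r, t in zip(state, k1, k2, k3, k4))

def invariants(state, G):
    """Return (e.w, G.w, E) as in (4.5), (4.6), (4.7)."""
    e, w = state[:3], state[3:]
    return (dot(e, w), dot(G, w), 0.5*dot(w, w) - dot(e, G))

G = (0.0, 0.0, -4.0)                      # hypothetical effective force vector
tilt = math.radians(30.0)
state = (math.sin(tilt), 0.0, math.cos(tilt)) + (0.3, 0.5, 6.0)   # e0 and w0
start = invariants(state, G)
for _ in range(20000):
    state = rk4_step(state, G, 5e-4)
end = invariants(state, G)
drift = max(abs(a - b) for a, b in zip(start, end))
print(start, drift)
```

The printed drift is limited only by the integrator, so the three scalars can safely be used to reduce the order of the problem, as the text goes on to do.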


The Compound Pendulum 


The motion of a pendulum is a special case of the motion of a symmetric top.
A compound pendulum is a rigid body free to rotate about a fixed horizontal
axis under the influence of gravity (Figure 4.1). If $I$ is the moment of inertia
for the axis, then the equations of motion for the pendulum are exactly as
specified above.

Fig. 4.1. A compound pendulum.

For a pendulum

$$\mathbf e\cdot\boldsymbol\omega = \mathbf G\cdot\boldsymbol\omega = 0. \tag{4.8}$$

Therefore, the motion of the pendulum is completely determined by the energy
integral (4.7). Equations (4.8) allow us to parametrize the attitude by

$$R = \alpha + i\sigma_1\beta, \tag{4.9}$$

where $\alpha$ and $\beta$ are scalars and $\sigma_1$ is the direction of the fixed rotation axis.
If $R$ is to specify the deviation from the downward vertical direction
$\sigma_3 = \hat{\mathbf G}$, then it is related to the angle of deviation $\theta$ by

$$R = \cos\tfrac12\theta + i\sigma_1\sin\tfrac12\theta = e^{\frac12 i\sigma_1\theta}. \tag{4.10}$$

However, the motion is more simply described in terms of the parameters $\alpha$
and $\beta$ instead of $\theta$, as we see below.
Substituting (4.9) into the energy integral (4.7) and using $\alpha^2 + \beta^2 = 1$ to
eliminate $\alpha$, we obtain

$$\dot\beta^2 = \tfrac12\,(E + G - 2G\beta^2)(1 - \beta^2). \tag{4.11}$$
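The step from (4.7) to (4.11) can be spelled out in a few lines. The sketch below assumes the parametrization (4.9) and (4.10), writes $G$ for the magnitude of $\mathbf G$, and uses $\mathbf e\cdot\mathbf G = G\cos\theta$ with $\cos\theta = 1 - 2\beta^2$.

```latex
\begin{align*}
|\dot R|^2 &= \dot\alpha^2 + \dot\beta^2, \qquad
  \alpha\dot\alpha = -\beta\dot\beta
  \;\Longrightarrow\; |\dot R|^2 = \frac{\dot\beta^2}{1-\beta^2}, \\
E &= 2|\dot R|^2 - G\cos\theta
   = \frac{2\dot\beta^2}{1-\beta^2} - G\,(1-2\beta^2), \\
\dot\beta^2 &= \tfrac12\,(E + G - 2G\beta^2)(1-\beta^2).
\end{align*}
```

Setting $E = \tfrac12\omega_0^2 - G$ in the last line gives $\dot\beta^2 = G\,(\omega_0^2/4G - \beta^2)(1-\beta^2)$, which is the form used for the oscillating pendulum.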


With a suitable identification of constants, this can be put in the standard 
form of Equation (B.2) in Appendix B; so the solution can be expressed in 
terms of elliptic functions. There are three cases to be considered, depending 
on the value of 


$$E = \tfrac12\omega_0^2 - G, \tag{4.12}$$

where $\tfrac12\omega_0^2$ is the maximum kinetic energy of the pendulum.

(a) When $E < G$, let $k^2 = \omega_0^2/4G < 1$, so (4.11) assumes the form

$$\dot\beta^2 = G\,(k^2 - \beta^2)(1 - \beta^2).$$

With the change of variables $\beta = ky$ and $x = G^{1/2}t$, this equation becomes
identical with (B.2), so it has the solution

$$\beta = k\,\mathrm{sn}\,x = k\,\mathrm{sn}(G^{1/2}t).$$

Since $\alpha^2 + \beta^2 = 1$, we have from (B.9)

$$\alpha = \mathrm{dn}(G^{1/2}t).$$

Also, we can write

$$E = G(2k^2 - 1) = -G\langle R_0^2\rangle_0 = -G(1 - 2\beta_0^2),$$

where $\beta_0 = \sin\tfrac12\theta_0$ is the value of $\beta$ (and $R_0$ the value of $R$) at the angle
$\theta_0$ of greatest deflection from the vertical. Thus, the spinor solution to the
equations of motion can be given the explicit form

$$R = \mathrm{dn}(G^{1/2}t) + i\sigma_1\,\beta_0\,\mathrm{sn}(G^{1/2}t), \tag{4.13}$$

where $\beta_0 = \sin\tfrac12\theta_0 = k$ is the modulus of the elliptic function.
Comparing (4.13) with (B.7) we can conclude that the period of motion is

$$T = \frac{4K}{G^{1/2}}. \tag{4.14}$$

When $\theta_0 \approx 0$, then $k \approx 0$ and $K \approx \tfrac12\pi$, so $T \approx T_0 = 2\pi G^{-1/2}$, the period we
found in (3.34) for small oscillations. The exact value of the period for a
pendulum depends on its amplitude, as shown in Table 4.1.
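The amplitude dependence of the period is easy to reproduce by machine. The sketch below evaluates the complete elliptic integral $K(k)$ through the arithmetic-geometric mean, $K = \pi/(2\,\mathrm{agm}(1, \sqrt{1-k^2}))$, which is a standard identity rather than the method of Appendix B, and then forms the ratio $T/T_0 = 2K/\pi$. The function names are illustrative.

```python
import math

def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(60):                       # converges quadratically
        if abs(a - b) <= 1e-15 * a:
            break
        a, b = 0.5*(a + b), math.sqrt(a*b)
    return a

def complete_K(k):
    """Complete elliptic integral of the first kind for modulus 0 <= k < 1."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k*k)))

def period_ratio(theta0_deg):
    """T/T0 = 2K/pi for a pendulum of angular amplitude theta0 (degrees)."""
    k = math.sin(math.radians(theta0_deg) / 2.0)
    return 2.0 * complete_K(k) / math.pi

for theta0 in (30, 60, 90, 120, 150):
    k = math.sin(math.radians(theta0) / 2.0)
    print(theta0, round(complete_K(k), 4), round(period_ratio(theta0), 3))
```

For small amplitudes the ratio tends to 1, and it grows without bound as the amplitude approaches 180°, where `complete_K` must not be called because $k = 1$.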

TABLE 4.1. Dependence of the period of a pendulum on its amplitude.

θ_0      k = β_0 = sin ½θ_0             K        T/T_0 = 2K/π
30°      sin 15° = (√3 − 1)/(2√2)       1.598    1.02
60°      sin 30° = 1/2                  1.686    1.07
90°      sin 45° = 1/√2                 1.854    1.18
120°     sin 60° = √3/2                 2.156    1.37
150°     sin 75° = (√3 + 1)/(2√2)       2.768    1.76
180°     sin 90° = 1                    ∞        ∞

(b) When $E > G$, let $k^2 = 4G/\omega_0^2 < 1$, so (4.12) gives us $E + G = 2G/k^2$,
and (4.11) becomes

$$\dot\beta^2 = \frac{G}{k^2}\,(1 - k^2\beta^2)(1 - \beta^2).$$

Hence,

$$R = \mathrm{cn}\!\left(\frac{G^{1/2}t}{k}\right) + i\sigma_1\,\mathrm{sn}\!\left(\frac{G^{1/2}t}{k}\right). \tag{4.15}$$

This solution describes a pendulum which makes complete revolutions with
period

$$T = \frac{2kK}{G^{1/2}} = \frac{4K}{\omega_0}. \tag{4.16}$$

This is half the period of $R$, because $\mathbf e = R^\dagger\hat{\mathbf G}R = \hat{\mathbf G}R^2$, so the period of the
pendulum motion is the period of $R^2$. In the preceding case, the periods of $R$
and $R^2$ were the same.
(c) When $E = G$, we have $k^2 = \omega_0^2/4G = 1$, and (4.11) becomes

$$\dot\beta^2 = G\,(1 - \beta^2)^2.$$

This has the solution $\beta = \tanh(G^{1/2}t)$, so

$$R = \mathrm{sech}(G^{1/2}t) + i\sigma_1\tanh(G^{1/2}t). \tag{4.17}$$


The pendulum never quite reaches the upward vertical position. 


Solution of the Lagrange Problem 


We have reduced the Lagrange problem to solving the three integrals of
motion (4.5), (4.6) and (4.7). The next step is to identify the parameters
which provide the simplest description of the motion, as we did in the special
case of the pendulum. To do that we note that the vector $\mathbf G$ introduces a
preferred direction in the problem, the “downward vertical”. For an erect
top, we are interested in deviations of the top’s symmetry axis from the
upward vertical, so we specify $\sigma_3 = -\hat{\mathbf G}$. For a hanging top, as in the case of
the pendulum, $\sigma_3 = \hat{\mathbf G}$ would be more appropriate. Both cases are taken care
of, respectively, by writing

$$G = -\mathbf G\cdot\sigma_3 = \pm|\mathbf G|. \tag{4.18}$$

Now the factor $\sigma_3 R\,\mathbf G = -G\,\sigma_3 R\,\sigma_3$ in (4.7) suggests that we might simplify
the energy equation by writing

$$R = \alpha_+ - i\sigma_2\,\alpha_-, \tag{4.19}$$

where $\alpha_+$ and $\alpha_-$ are quaternions which commute with $\sigma_3$; for then

$$\sigma_3 R\,\sigma_3 = \alpha_+ + i\sigma_2\,\alpha_-. \tag{4.20}$$

The commutativity with $\sigma_3$ implies that we can write

$$\alpha_\pm = \lambda_\pm\,e^{i\sigma_3\phi_\pm}, \tag{4.21}$$

where $\lambda_\pm$ and $\phi_\pm$ are scalars. Of course,

$$R^\dagger = \alpha_+^\dagger + \alpha_-^\dagger\,i\sigma_2, \tag{4.22}$$

so the parameters are related by

$$R^\dagger R = |\alpha_+|^2 + |\alpha_-|^2 = \lambda_+^2 + \lambda_-^2 = 1. \tag{4.23}$$

The variables $\alpha_\pm$ are called the Cayley-Klein parameters in the scientific
literature. However, our formulation identifies the imaginary unit in these
parameters as the specific bivector $i\sigma_3$. This enables us to see exactly when
and why the Cayley-Klein parameters are useful, namely, in rotational problems
where a preferred direction $\sigma_3$ is specified. Readers who are familiar with
advanced quantum mechanics will be interested to note that our decomposition
of the spinor $R$ into Cayley-Klein parameters corresponds exactly to the
standard decomposition of an electron wave function into “spin up” and
“spin down” amplitudes.

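To express the integrals of motion in terms of the Cayley-Klein parameters,
note that

$$\alpha_\pm\,\sigma_2 = \sigma_2\,\alpha_\pm^\dagger. \tag{4.24}$$

This helps us compute

$$\dot R R^\dagger = \dot\alpha_+\alpha_+^\dagger + \dot\alpha_-^\dagger\alpha_- + i\sigma_2\,(\dot\alpha_+^\dagger\alpha_- - \dot\alpha_-\alpha_+^\dagger),$$

$$R^\dagger\dot R = \alpha_+^\dagger\dot\alpha_+ + \alpha_-^\dagger\dot\alpha_- + i\sigma_2\,(\alpha_-\dot\alpha_+ - \alpha_+\dot\alpha_-).$$

Also we find

$$\alpha_+^\dagger\dot\alpha_+ = \lambda_+\dot\lambda_+ + i\sigma_3\,\lambda_+^2\dot\phi_+, \tag{4.25}$$

with a similar expression for $\alpha_-^\dagger\dot\alpha_-$. Now the integrals of motion (4.5) and (4.6)
can be put in the form

$$\lambda_+^2\dot\phi_+ - \lambda_-^2\dot\phi_- = \tfrac12\,\boldsymbol\omega\cdot\mathbf e,$$

$$\lambda_+^2\dot\phi_+ + \lambda_-^2\dot\phi_- = \tfrac12\,\boldsymbol\omega\cdot\sigma_3.$$

Or equivalently,

$$\lambda_+^2\dot\phi_+ = \tfrac14\,\boldsymbol\omega\cdot(\sigma_3 + \mathbf e) \equiv \gamma_+, \tag{4.26a}$$

$$\lambda_-^2\dot\phi_- = \tfrac14\,\boldsymbol\omega\cdot(\sigma_3 - \mathbf e) \equiv \gamma_-. \tag{4.26b}$$

It will be noted that these expressions are analogous to angular momentum
conservation for central force motion. They enable us to calculate $\phi_\pm$ by
straightforward integration when $\lambda_\pm$ is known.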

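To evaluate the energy integral we calculate

$$\sigma_3\mathbf e = \sigma_3 R^\dagger\sigma_3 R = |\alpha_+|^2 - |\alpha_-|^2 - 2i\sigma_2\,\alpha_+\alpha_-.$$

So with the help of (4.23), we find

$$\sigma_3\cdot\mathbf e = 2|\alpha_+|^2 - 1 = 2\lambda_+^2 - 1. \tag{4.27}$$

For $\sigma_3\cdot\mathbf e = \cos\theta$, this tells us that

$$\lambda_+ = \cos\tfrac12\theta \quad\text{and}\quad \lambda_- = \sin\tfrac12\theta. \tag{4.28}$$

Using (4.27) in the energy integral (4.7), we get

$$E = 2\,(|\dot\alpha_+|^2 + |\dot\alpha_-|^2) + G\,(2|\alpha_+|^2 - 1). \tag{4.29}$$

From (4.25) and (4.26a, b) we find

$$|\dot\alpha_\pm|^2 = \dot\lambda_\pm^2 + \gamma_\pm^2\,\lambda_\pm^{-2}. \tag{4.30}$$

Substituting this into (4.29) and using (4.23) to eliminate $\lambda_-$, we obtain

$$\lambda_+^2\dot\lambda_+^2 = G\lambda_+^6 - \tfrac12(E + 3G)\lambda_+^4 + \left[\tfrac12(E + G) + \gamma_+^2 - \gamma_-^2\right]\lambda_+^2 - \gamma_+^2. \tag{4.31}$$

This differential equation can be solved for $\lambda_+$, and then $\lambda_-$ can be obtained
from (4.23).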

According to Appendix B, Equation (4.31) has a closed solution in terms of
elliptic functions given by

$$\lambda_+^2 = a\,\mathrm{sn}^2\mu t + b, \tag{4.32}$$

where $a$ and $b$ are constants. This tells us that $\lambda_+^2$ is a periodic function of time
with maximum and minimum values satisfying

$$0 \le b \le 1, \tag{4.33a}$$

$$0 \le a + b \le 1. \tag{4.33b}$$

Thus, $\lambda_+ = \cos\tfrac12\theta$ oscillates symmetrically about the value

$$\cos\tfrac12\theta_0 = |b + \tfrac12 a|^{1/2}. \tag{4.34}$$

Comparing (4.31) with (B.12) and (B.16a, b, c, d) in Appendix B, we find
that $b$ is determined by the cubic equation

$$Gb^3 - \tfrac12(E + 3G)b^2 + \left[\tfrac12(E + G) + \gamma_+^2 - \gamma_-^2\right]b - \gamma_+^2 = 0 \tag{4.35}$$

and (4.33a) tells us how to identify the “physical root”. Also, after $b$ is known
we can get $a$ by solving the quadratic equation

$$2Ga^2 + [G(6b - 3) - E]\,a + E(1 - 2b) + G(6b^2 - 6b + 1) + 2(\gamma_+^2 - \gamma_-^2) = 0, \tag{4.36}$$

and (4.33a, b) tells us that the “physical root” must satisfy $|a| < 1$. Finally the
“time constant” $\mu$ in the solution is obtained from $a$ and $b$ by using

$$\mu^2 = \tfrac12(3G + E) - G(3b + a) \tag{4.37}$$

and the modulus $k$ of the elliptic function is given by

$$k^2 = \frac{G\,a}{\mu^2}. \tag{4.38}$$

Thus, we have determined all the constants in the solution (4.32) for $\lambda_+$.
Using (4.32) we solve (4.26a) for the angle $\phi_+$:

$$\phi_+(t) = \gamma_+\int_0^t \frac{dt}{b + a\,\mathrm{sn}^2(\mu t)}.$$

This can be put in the standard form

$$\phi_+(t) = \frac{\gamma_+}{b\mu}\,\Pi\!\left(\mu t, \frac{a}{b}\right), \tag{4.39}$$

where

$$\Pi(\tau, \alpha^2) = \int_0^\tau \frac{d\tau}{1 + \alpha^2\,\mathrm{sn}^2\tau} \tag{4.40}$$

is a standard function known as the incomplete elliptic integral of the third
kind. By the same method we find

$$\phi_-(t) = \frac{\gamma_-}{(1 - b)\mu}\,\Pi\!\left(\mu t, \frac{-a}{1 - b}\right), \tag{4.41}$$

$$\lambda_-^2 = 1 - b - a\,\mathrm{sn}^2\mu t. \tag{4.42}$$


Numerical values for the elliptic integrals as well as the elliptic functions can 
be found in standard tables, but nowadays it is much more convenient to get 
the results by computer calculation. 
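One way to carry out such a computer calculation with nothing but the standard library is sketched below. It evaluates the Jacobi functions sn, cn and dn by the arithmetic-geometric-mean (descending Landen) recursion found in standard handbooks; this is a substitute for table lookup, not a procedure taken from Appendix B. The function names and the sample parameter value are assumptions for illustration, and the parameter is written as `m = k**2`.

```python
import math

def agm_sequences(m):
    """AGM sequences a_n, c_n for parameter m = k**2 (0 <= m < 1)."""
    a, c = [1.0], [math.sqrt(m)]
    b = math.sqrt(1.0 - m)
    while abs(c[-1]) > 1e-15 * a[-1] and len(a) < 40:
        a_prev = a[-1]
        c.append(0.5 * (a_prev - b))
        a.append(0.5 * (a_prev + b))
        b = math.sqrt(a_prev * b)
    return a, c

def complete_K(m):
    """Complete elliptic integral of the first kind, K = pi / (2 * agm)."""
    a, _ = agm_sequences(m)
    return math.pi / (2.0 * a[-1])

def jacobi_sn_cn_dn(u, m):
    """Return (sn u, cn u, dn u) for real u and parameter m = k**2, 0 <= m < 1."""
    a, c = agm_sequences(m)
    n = len(a) - 1
    phi = (2 ** n) * a[n] * u
    for i in range(n, 0, -1):                 # descend back to the amplitude phi_0
        phi = 0.5 * (phi + math.asin(c[i] * math.sin(phi) / a[i]))
    sn, cn = math.sin(phi), math.cos(phi)
    dn = math.sqrt(1.0 - m * sn * sn)         # dn > 0 for real u and m < 1
    return sn, cn, dn

m = 0.64                                      # sample value, modulus k = 0.8
K = complete_K(m)
print(jacobi_sn_cn_dn(K, m))                  # quarter period: sn = 1, cn = 0
print(jacobi_sn_cn_dn(0.5 * K, m))
```

With these three functions, an expression such as (4.32) is a one-line evaluation once $a$, $b$, $\mu$ and $k$ have been found; the elliptic integral of the third kind still has to be obtained by numerical quadrature or a library routine.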

We now have a complete closed solution of the Lagrange problem. Unfor- 
tunately, our solution is not easy to interpret. A picture of the motion can be 
obtained by computer simulation, or, more laboriously, by further math- 
ematical analysis of the solution. But we will not pursue the matter further, 
since we already have a clear picture from our approximate solution in 
Section 7-3. 


Freely Spinning Asymmetric Body 


We turn now to the problem of finding an analytic description for the motion
of an arbitrary, freely spinning rigid body. The dynamical equation of motion
for the body is

$$\dot{\mathbf l} = \mathcal I\dot{\boldsymbol\omega} + \boldsymbol\omega\times\mathbf l = 0. \tag{4.43}$$

Note that because of the vanishing torque, this equation is not coupled to the
kinematical equation

$$\dot R = \tfrac12\,R\,i\boldsymbol\omega. \tag{4.44}$$

Therefore, it can be solved directly for $\boldsymbol\omega = \boldsymbol\omega(t)$, and the result can be
inserted into (4.44) to determine the attitude spinor $R$. From (4.43) we can
conclude immediately that the angular momentum $\mathbf l = \mathcal I\boldsymbol\omega$ and the energy
$E = \tfrac12\,\boldsymbol\omega\cdot\mathbf l$ are constants of motion. So we want to solve the differential
equation (4.43) for $\boldsymbol\omega$ in terms of $\mathbf l$ and $E$.

Before attempting a general solution, let us consider the possibility of
steady rotational motion about a fixed axis. In that case $\dot{\boldsymbol\omega} = 0$ and (4.43)
implies that $\boldsymbol\omega\times\mathbf l = 0$, so $\boldsymbol\omega$ must be collinear with $\mathbf l$. Since $\mathbf l = \mathcal I\boldsymbol\omega$, this is
possible only if $\boldsymbol\omega$ is directed along one of the principal axes. Moreover, if
$\boldsymbol\omega\times\mathbf l \ne 0$, then (4.43) implies that $\dot{\boldsymbol\omega} \ne 0$. Therefore, in the absence of an
applied torque, steady rotational motion is possible if and only if the axis of
rotation coincides with a principal axis of the inertia tensor. We will ascertain
conditions for the stability of steady rotation later on.

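Returning to the general problem, we follow (1.17) and put the equation of
motion (4.43) in the form

$$I_1\dot\omega_1 = (I_2 - I_3)\,\omega_2\omega_3, \tag{4.45a}$$

$$I_2\dot\omega_2 = (I_3 - I_1)\,\omega_3\omega_1, \tag{4.45b}$$

$$I_3\dot\omega_3 = (I_1 - I_2)\,\omega_1\omega_2, \tag{4.45c}$$

where the

$$\omega_k = \boldsymbol\omega\cdot\mathbf e_k = I_k^{-1}\,\mathbf l\cdot\mathbf e_k \tag{4.46}$$

are components of the rotational velocity with respect to the principal axes.
Since we have already analyzed the motion of a symmetric body, we assume
that all the principal moments of inertia $I_k$ have different values.

The three variables $\omega_1$, $\omega_2$, $\omega_3$ are related by the integrals of motion

$$l^2 = I_1^2\omega_1^2 + I_2^2\omega_2^2 + I_3^2\omega_3^2, \tag{4.47a}$$

$$2E = \boldsymbol\omega\cdot\mathcal I\boldsymbol\omega = I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2. \tag{4.47b}$$

Therefore we should be able to eliminate two of the variables from (4.45a, b,
c) to get an equation for the third alone. To this end, it is convenient to
eliminate each variable in turn from (4.47a, b) to get

$$l^2 - 2EI_1 = I_2(I_2 - I_1)\,\omega_2^2 + I_3(I_3 - I_1)\,\omega_3^2, \tag{4.48a}$$

$$l^2 - 2EI_2 = I_1(I_1 - I_2)\,\omega_1^2 + I_3(I_3 - I_2)\,\omega_3^2, \tag{4.48b}$$

$$l^2 - 2EI_3 = I_1(I_1 - I_3)\,\omega_1^2 + I_2(I_2 - I_3)\,\omega_2^2. \tag{4.48c}$$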
To obtain an equation for $\omega_3$ alone, we square (4.45c) and use (4.48a, b) to
eliminate $\omega_1$ and $\omega_2$; thus,

$$I_3^2\dot\omega_3^2 = (I_1 - I_2)^2\,\omega_1^2\omega_2^2
 = (I_1 - I_2)^2\left[\frac{l^2 - 2EI_2 - I_3(I_3 - I_2)\,\omega_3^2}{I_1(I_1 - I_2)}\right]\left[\frac{l^2 - 2EI_1 - I_3(I_3 - I_1)\,\omega_3^2}{I_2(I_2 - I_1)}\right]. \tag{4.49}$$

Comparing this with the equation 

Vom etl ye). (lV tkey 2) (4.50) 
we see that it will have a solution of the form 

W, = @,Sn ut (4.51) 
provided 

| ey Be oe Cs (4.52) 
and 

ZEL—= Ve, Ze, 0. (4.53) 


Of course we are free to label the $I_k$ so (4.52) holds, and we notice that the
inequalities (4.53) are then consequences of (4.48a, b). Therefore, the
conditions for the solution (4.51) are satisfied; and the constants can be
evaluated by inserting (4.51) into (4.50) and comparing the results with (4.49).
At the same time, by comparing the two lines of (4.49) we can determine $\omega_1$
and $\omega_2$. There are two cases, for which we find


Case (a): 

@,=a,dn(ut), w,=a,cn(pt). (4.54a) 
Case (b): 

@, =a, cn(ut), w,=a,dn(ut). (4.54b) 


For both cases we get 


s. QHREP «|, _ P2ORp 


C=) eee ee 2 4.54 
T(r ty) L(Y, - f) ( ) 
For case (a), 
pe i? = 2EF. ee i, =1 EL — 1) 
IGE = Ts) ; IRAE , 
2 I,-1 2EL —[° 
k? —_ 3S 1 Be 
(7 =a | [? — 2EI, 02) 


and for case (b) 
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Apes Zhi 2 SUE) (S2ET)) 


to) an 
2 I,-1, \{?-2EI 
ke = 2 $3 CT eee 
| L,-1, | oe (4.56b) 


The quantities u, k and a, can be taken to be positive. To determine the signs 
of a, and a,, we substitute the solutions into (4.45a, b, c), and after carrying 
out the differentiation (see Appendix B), we obtain 


tals 
(/, ee I) te a I) eis a i) 
Therefore, if we take a, > 0, then a, < 0. Finally, note that the two cases are 
distinguished by the requirement that the expression for k* in (4.56a) or 


(4.56b) must satisfy k* < 1. By substitution into (4.48a), we see that this 
requirement can be reexpressed as the condition that 


l’-2EI, > 0 for case (a) (4.58a) 
l?-2EI, < 0 for case (b). (4.58b) 


This completes our solution of the dynamical equation of motion. 
The problem remains to determine the attitude spinor $R$ from the known
functions $\omega_k = \omega_k(t)$. We could proceed by integrating


<0. (4.57) 


a,a,a, = 


$$\dot R = \tfrac12\,R\,i\boldsymbol\omega = \tfrac12\,i\,(\omega_1\sigma_1 + \omega_2\sigma_2 + \omega_3\sigma_3)\,R,$$


but there is a much simpler way which exploits the constants of motion and
determines $R$ almost completely by algebraic means. The angular momentum
direction cosines

$$h_k = \hat{\mathbf l}\cdot\mathbf e_k = l^{-1}I_k\,\omega_k \tag{4.59}$$

are more convenient parameters than the $\omega_k$, because then we can write

$$\hat{\mathbf l} = h_1\mathbf e_1 + h_2\mathbf e_2 + h_3\mathbf e_3 = R^\dagger(h_1\sigma_1 + h_2\sigma_2 + h_3\sigma_3)R = \sigma_3, \tag{4.60}$$

where we have used our prerogative to identify $\sigma_3$ with the distinguished
direction $\hat{\mathbf l}$ in our problem. Note that with this choice


h, = 6,e, = V'l,w, = a, sn t. (4.61) 


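The question is now, what does (4.60) tell us about the functional form of $R$?
As before, the fact that $\sigma_3$ is a distinguished direction suggests it may be
convenient to express $R$ in terms of Cayley-Klein parameters:

$$R = \alpha_+ - i\sigma_2\,\alpha_-, \tag{4.62}$$

$$\alpha_\pm = \lambda_\pm\,e^{i\sigma_3\phi_\pm}. \tag{4.63}$$

Using this parametrization for $R$, from (4.60) we obtain

$$h_1\sigma_1 + h_2\sigma_2 + h_3\sigma_3 = R\,\sigma_3R^\dagger = (\lambda_+^2 - \lambda_-^2)\,\sigma_3 + 2\sigma_1\,\alpha_+^\dagger\alpha_-. \tag{4.64}$$

Hence,

$$h_3 = \lambda_+^2 - \lambda_-^2 \tag{4.65}$$

and

$$h_1 + i\sigma_3h_2 = 2\alpha_-\alpha_+^\dagger = 2\lambda_+\lambda_-\,e^{i\sigma_3(\phi_- - \phi_+)}. \tag{4.66}$$

Since

$$R^\dagger R = \lambda_+^2 + \lambda_-^2 = 1,$$

from (4.65) we obtain

$$\lambda_\pm = \left[\tfrac12(1 \pm h_3)\right]^{1/2}. \tag{4.67}$$

And (4.66) gives us

$$\phi_- - \phi_+ = \tan^{-1}\!\left(\frac{h_2}{h_1}\right). \tag{4.68}$$

Thus, we have determined $\lambda_\pm$ and $\phi_- - \phi_+$ as functions of the $h_k$, so we can
complete our solution by determining $\phi_+ + \phi_-$. That requires an integration.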

Instead of determining ¢, + @_ directly, it is more convenient to determine 
the variable 


p=, + 26_-4, (4.69) 
which, as established in Exercise (4.6), is one of the Euler angles. Using 
@ = -2iR'*R, 


we can express w in terms of the Cayley-Klein parameters or the Euler angles 
and their derivatives (Exercise 4.5). 
whence we obtain 


o+ho, = o-o, = = , 
hp a o. = wre, =, 


Eliminating @, , we get 


. ; . l 7a Boe i 1 
=o, 20. | or 
eae gr an 
This integrates to
2EI,-I° 
#11) = (6) + (t= 6) + FER" tae, at) — (0. a), (4.71) 
where $\Pi(\tau, \alpha^2)$ is the incomplete elliptic integral of the third kind defined by
(4.40).

Although we now have a complete and exact analytic solution to our 
problem, the solution does not immediately provide a clear picture of the 
body’s motion. For that purpose, we now look at the problem in a different 
way. 


Poinsot’s Construction 


To develop a picture for the motion of a freely spinning asymmetric body,
consider the restriction on the rotational velocity $\boldsymbol\omega$ due to the energy integral:

$$\tfrac12\,\boldsymbol\omega\cdot(\mathcal I\boldsymbol\omega) = E. \tag{4.72}$$

This is the equation for an ellipsoid; call it the energy ellipsoid. Thus, for a
given rotational kinetic energy $E$, $\boldsymbol\omega$ must be a point on this ellipsoid. Note
that the principal axes of the energy ellipsoid are the same as those of the
inertia tensor. So the attitude of the energy ellipsoid in space faithfully
represents the attitude of the body itself.

The normal to the energy ellipsoid at a point $\boldsymbol\omega$ is given by the gradient

$$\nabla_{\boldsymbol\omega}\left(\tfrac12\,\boldsymbol\omega\cdot\mathcal I\boldsymbol\omega\right) = \mathcal I\boldsymbol\omega = \mathbf l. \tag{4.73}$$

For a constant angular momentum $\mathbf l$, this puts a further restriction on the
allowed values of $\boldsymbol\omega$. Indeed, for fixed $\mathbf l$ and variable $\boldsymbol\omega$, the energy integral

$$\boldsymbol\omega\cdot\mathbf l = 2E \tag{4.74}$$

is the equation for a plane with normal $\mathbf l$ and distance $2E/|\mathbf l|$ from the origin.
This plane is called the invariable plane. Therefore, for given $E$ and $\mathbf l$, at any
time $t$, the invariable plane is tangent to the energy ellipsoid at the point
$\boldsymbol\omega = \boldsymbol\omega(t)$ (Figure 4.2). Moreover the energy ellipsoid can be said to roll on
the invariable plane without slipping, since the point of contact is on the
instantaneous rotation axis.
This picture of the motion is due to Poinsot (1834). An apparatus that shows
subtle features of the motion has been described by Harter and Kim
(Amer. J. Phys. 44, 1080 (1976)).

Fig. 4.2. The invariable plane is tangent to the energy ellipsoid at $\boldsymbol\omega(t)$.

As $\boldsymbol\omega(t)$ varies with time, it traces out a curve on the energy ellipsoid called
the polhode and a curve on the invariable plane called the herpolhode. We have
already found a parametric equation for the polhode, given by (4.51) and
(4.54a, b). But a nonparametric representation makes it easier to picture the
curve as a whole. This is easily found by noting that angular momentum
conservation implies that $\boldsymbol\omega$ must lie on the ellipsoid

$$l^2 = \boldsymbol\omega\cdot(\mathcal I^2\boldsymbol\omega). \tag{4.75}$$

Therefore, the polhode is a curve of intersection of the two ellipsoids (4.72)
and (4.75). This tells us at once that a polhode is a closed curve, so the motion
is periodic. Polhodes for various initial conditions are shown in Figure 4.3. A
polhode can be interpreted as the curve traced out by the tip of the $\boldsymbol\omega$-vector
as “seen” by an observer rotating with the body.


Separating polhodes 
f° = 2EI, 


Fig. 4.3. The energy ellipsoid, showing polhodes for different initial conditions (J, < /, < 5). 


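Questions about the stability of rotational motion are best answered by
examining the family of polhodes with different initial conditions. We have
already proved that steady rotation is possible only about a principal axis. To
investigate the stability of steady rotation quantitatively, we consider a small
departure from the steady motion by writing

$$\boldsymbol\omega = \boldsymbol\omega_k + \boldsymbol\varepsilon, \tag{4.76a}$$

where

$$\mathcal I\boldsymbol\omega_k = I_k\boldsymbol\omega_k \tag{4.76b}$$

indicates that $\boldsymbol\omega_k$ is directed along the $\mathbf e_k$ principal axis, and

$$\boldsymbol\varepsilon\cdot\boldsymbol\omega_k = 0 \tag{4.76c}$$

must be satisfied for a small deviation $\boldsymbol\varepsilon$. The plan now is to get an equation
for $\boldsymbol\varepsilon$, so we can study its behavior. Substituting (4.76a) into (4.72) and (4.75)
and using (4.76b), we get

$$2E = I_k\omega_k^2 + 2I_k\,\boldsymbol\varepsilon\cdot\boldsymbol\omega_k + \boldsymbol\varepsilon\cdot\mathcal I\boldsymbol\varepsilon,$$

$$l^2 = I_k^2\omega_k^2 + 2I_k^2\,\boldsymbol\varepsilon\cdot\boldsymbol\omega_k + \boldsymbol\varepsilon\cdot\mathcal I^2\boldsymbol\varepsilon.$$

Using the approximation (4.76c) and eliminating $\omega_k^2$ from these two equations,
we get the desired equation for $\boldsymbol\varepsilon$:

$$l^2 - 2I_kE = \boldsymbol\varepsilon\cdot(\mathcal I^2 - I_k\mathcal I)\,\boldsymbol\varepsilon. \tag{4.77}$$

This is, in fact, an exact equation for the projection of the polhode onto a
plane with normal $\mathbf e_k$. If we decompose $\boldsymbol\varepsilon$ into its components $\varepsilon_j = \boldsymbol\varepsilon\cdot\mathbf e_j$ with
respect to the principal axes, this equation can be written

$$I_i(I_i - I_k)\,\varepsilon_i^2 + I_j(I_j - I_k)\,\varepsilon_j^2 = l^2 - 2I_kE, \tag{4.78}$$

where $i$ and $j$ label the other two principal axes. This will be recognized as
the equation for an ellipse if $I_k$ is greater than or less than both $I_i$ and $I_j$.
This means $\boldsymbol\omega$ will stay close to $\boldsymbol\omega_k$ during the entire motion.
Therefore, steady rotation about the two axes with the largest and smallest
moment of inertia is stable.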

On the other hand, if the value of $I_k$ is between the values of $I_i$ and $I_j$, then
(4.78) is the equation for a hyperbola. So if $\boldsymbol\varepsilon$ has any small value initially, it
will increase with time, and $\boldsymbol\omega$ will wander away from $\boldsymbol\omega_k$. Therefore, steady
rotation about the intermediate principal axis is unstable. This can be seen by
examining the polhodes in Figure 4.3. And it can be empirically demonstrated
by attempting to throw an asymmetric object like a tennis racket up in the air
so that it spins about a principal axis.
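The same conclusion can be reached by direct numerical experiment. The sketch below integrates the torque-free Euler equations for a body with hypothetical principal moments 1, 2, 3 (arbitrary units), starting almost exactly along each principal axis in turn, and records how far the direction of the rotational velocity ever strays from that axis. The helper names and all numerical values are illustrative assumptions.

```python
import math

def euler_rhs(w, I):
    """Torque-free Euler equations in principal-axis components."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)

def rk4_step(w, I, h):
    def shifted(k, c):
        return tuple(x + c * h * ki for x, ki in zip(w, k))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(k1, 0.5), I)
    k3 = euler_rhs(shifted(k2, 0.5), I)
    k4 = euler_rhs(shifted(k3, 1.0), I)
    return tuple(x + h * (p + 2*q + 2*r + s) / 6
                 for x, p, q, r, s in zip(w, k1, k2, k3, k4))

def max_wander(axis, I, eps=1e-3, t_end=60.0, steps=30000):
    """Largest value of |w_perp| / |w| reached when starting near one axis."""
    w = [eps, eps, eps]
    w[axis] = 1.0
    w = tuple(w)
    h = t_end / steps
    worst = 0.0
    for _ in range(steps):
        w = rk4_step(w, I, h)
        norm2 = sum(x * x for x in w)
        perp2 = max(norm2 - w[axis] ** 2, 0.0)
        worst = max(worst, math.sqrt(perp2 / norm2))
    return worst

I = (1.0, 2.0, 3.0)                      # hypothetical principal moments
wander = [max_wander(axis, I) for axis in range(3)]
print(wander)
```

For the smallest and largest moments the deviation stays of the order of the initial perturbation, while for the intermediate axis it grows until the rotational velocity is nearly perpendicular to its starting direction, which is the tumbling seen with a tossed tennis racket.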

As the energy ellipsoid rolls on the invari- 
able plane, the polhode rolls on the herpol- 
hode. In contrast to the polhode, the 
herpolhode is not necessarily a closed curve, 
but as shown in Figure 4.4, it must oscillate 
between maximum and minimum values 
corresponding to maxima and minima of the 
polhode. 


Fig. 4.4. The herpolhode is confined 
to an annulus in the invariable plane. 


7-4. Exercises 


(4.1) Verify the expressions (4.5), (4.6) and (4.7) for the constants of
motion in terms of $R$ and $\dot R$.

(4.2) For a compound pendulum, show that the frequencies of oscillation
about two different parallel axes will be the same if and only if
$rr' = r_0^2$, where $r$ and $r'$ are the distances of the points from the CM,
and $r_0$ is the CM radius of gyration. Show also that the oscillation
frequency is that of a simple pendulum with length $r + r'$.
(4.3) Show that the elliptic integral (B.6) in Appendix B can be written in the form:

$$ K = \int_0^{\pi/2} \frac{d\phi}{(1 - k^2\sin^2\phi)^{1/2}}. $$

Expand the integrand in a series and perform term-by-term integration to get the following expression for the period of a plane pendulum:

$$ T = 2\pi\left(\frac{l}{g}\right)^{1/2}\left[1 + \left(\frac{1}{2}\right)^2 k^2 + \left(\frac{1\cdot 3}{2\cdot 4}\right)^2 k^4 + \cdots\right]. $$

Thus, show that the first order correction for the period in the small angle approximation gives:

$$ T = 2\pi\left(\frac{l}{g}\right)^{1/2}\left(1 + \frac{\theta_0^2}{16}\right), $$

where $\theta_0$ is the angular amplitude.

(4.4) A plane pendulum beats seconds when swinging through an angle of 6°. If the angle is increased to 8°, show that it will lose approximately 10 beats a day. (See Table 4.1).

(4.5) In the extensive literature on the Lagrange problem, the motion is usually parametrized by Euler angles. To compare the literature to the approach taken here, recall that the parametrization of a rotation by Euler angles $\psi, \theta, \phi$ is given by:


$$ R = e^{(1/2)i\boldsymbol{\sigma}_3\psi}\, e^{(1/2)i\boldsymbol{\sigma}_1\theta}\, e^{(1/2)i\boldsymbol{\sigma}_3\phi}, $$


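where $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_3$ are orthogonal constant unit vectors. For a rotating body subject to a constant effective force $\mathbf{G}$, we take $\boldsymbol{\sigma}_3 = \hat{\mathbf{G}}$ and $\mathbf{e} = R^\dagger\boldsymbol{\sigma}_3R = R^\dagger\hat{\mathbf{G}}R$ as the axis of symmetry. Then $\phi$ is called the precession angle, $\theta$ is called the nutation angle, and $\psi$ is called the phase angle. Show that the rotational velocity in terms of Euler angles is given by:

$$ \boldsymbol{\omega} = -2iR^\dagger\dot{R} = \hat{\mathbf{G}}\dot{\phi} + \widehat{\mathbf{G}\times\mathbf{e}}\,\dot{\theta} + \mathbf{e}\dot{\psi}, $$

where

$$ \widehat{\mathbf{G}\times\mathbf{e}} = \frac{\mathbf{G}\times\mathbf{e}}{|\mathbf{G}\times\mathbf{e}|}. $$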


Show that the constants of motion a=@-e, b=awG and E = 

+’ — Ge yield the equations 

a+ bcos @ 
sin’ 8 


Vy = 
(4.6)


(4.7) 


(4.8) 


. b + acos @ 

oan?” 

. 2 

pare (ins (ie +2Gcos@ = 2E-a’. 


sin’ 8 


The solution of these equations is discussed by many authors. Use our results to express $\cos\theta$ in terms of elliptic functions and show that it is a solution to the last of these equations.

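Show that the Cayley-Klein parameters are related to the Euler angles by

$$ \alpha_+ = \cos\tfrac{1}{2}\theta\; e^{(1/2)i\boldsymbol{\sigma}_3(\psi + \phi)}, \qquad \alpha_- = i\boldsymbol{\sigma}_1 \sin\tfrac{1}{2}\theta\; e^{(1/2)i\boldsymbol{\sigma}_3(\phi - \psi)}. $$

To establish the converse relations, show that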


R= e(2viB,0+ (A ee ia,A ) elV2io(ps + 2p. - 7) 
rs im 9 


Whence, 
prin, 
g= ¢, + 26-7. 


For the case of steady forced precession, compare the solution in terms of elliptic functions to the exact solution obtained in Section 7-3.

Show that when $R$ is expressed in terms of Cayley-Klein parameters $\alpha_\pm$, the spinor equation (3.24) can be separated into two uncoupled second order equations:


a, + (L(E + 3G) F 2G |a,/"Ja, = 0 


Of course, since G is a parameter which can have either sign, the 
two equations are essentially the same. To solve this equation 
directly, it is helpful to rewrite it by defining r = 6,a,, which is a 
vector because of Equation (4.21). In terms of this variable the 
equation becomes 


it = [}(E + 3G) -2Gr'Jr = 0, 


which we recognize as the equation for a particle in an unusual 
central force field. Use this fact to get Equations (4.26a) and (4.31) 
directly as first integrals of the motion with undetermined constants. 
Note that if we use Equation (4.32) we can put the above equation 
in the form 


a+ [A+ Bsr’ ut la =0, 


where A and B are scalar constants. This is called Lamé’s equation. 
Although we have solved this equation for the case of the top, that 
does not end the matter, because the form of our solution is 
probably not optimal. We derived separate expressions for the modulus and angle of $\alpha$, whereas there are alternative expressions for $\alpha$ as a unit which can probably be calculated more easily. This is a worthy issue for mathematical research. Evaluation of the solutions is discussed by E. T. Whittaker (A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, Dover, N.Y., 4th Ed. (1944), especially p. 161). Lamé’s equation is discussed by Whittaker and Watson (Modern Analysis, Cambridge U. Press (1952), Chapter 23). No doubt there are significant improvements yet to be made in the theory of the top, and we can expect new insights from bringing together spinor theory and the classical theory of elliptic functions.


7-5. Rolling Motion 


The mathematical description of rolling motion requires both translational and rotational equations of motion coupled by a rolling constraint. Consider a centrosymmetric sphere of radius $a$, mass $m$, and moment of inertia $I = mk^2$ rolling on a rough surface with unit normal $\mathbf{n}$ at the point of contact. Let $\mathbf{f}$ denote the “reaction force” exerted by the constraining surface on the sphere (Figure 5.1).

Fig. 5.1. Forces on a sphere rolling on a surface with unit normal n.

The translational and rotational equations of motion for the sphere are

$$ m\dot{\mathbf{V}} = \mathbf{f} + m\mathbf{g}, \tag{5.1} $$

$$ mk^2\dot{\boldsymbol{\omega}} = (-a\mathbf{n})\times\mathbf{f}, \tag{5.2} $$

where $\mathbf{V} = \dot{\mathbf{X}}$ is the center of mass velocity. These equations apply to a sphere rolling on an arbitrary surface even if the surface is moving, provided the surface is mathematically prescribed, so the normal $\mathbf{n}$ is a known function of position and time. Of course, we are interested here only in continuous surfaces with a unique normal at every point.

The velocity $\mathbf{v}$ of the point on the sphere which is instantaneously in contact with the surface is determined by the kinematical relation

$$ \mathbf{v} = \mathbf{V} + \boldsymbol{\omega}\times(-a\mathbf{n}). \tag{5.3} $$

Suppose the constraining surface is moving with a velocity $\mathbf{u}$ at the point of contact. The relative velocity $\mathbf{v} - \mathbf{u}$ must vanish if the sphere is not slipping. Therefore, the equation of constraint for rolling contact is

$$ \mathbf{u} = \mathbf{V} - a\boldsymbol{\omega}\times\mathbf{n}. \tag{5.4} $$

The velocity $\mathbf{u}$ will be known if the motion of the constraining surface is prescribed.
Now we have sufficient equations to determine rolling motion of the sphere on a given surface. A general strategy for solving these equations is to eliminate the reaction force between (5.1) and (5.2) to get

$$ k^2\dot{\boldsymbol{\omega}} = a\mathbf{n}\times(\mathbf{g} - \dot{\mathbf{V}}). \tag{5.5} $$

Then (5.4) can be used to get separate equations for $\boldsymbol{\omega}$ and $\mathbf{V}$. However, it must be remembered that (5.5) does not hold when $\mathbf{f} = 0$, that is, when the sphere loses contact with the constraining surface.

This is as far as we can go with the theory of rolling motion without assumptions about the constraining surface. So we turn now to consider special cases.


Rolling on an inclined plane 


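For a fixed inclined plane, the normal $\mathbf{n}$ is constant and the equation (5.4) for rolling contact reduces to

$$ \mathbf{V} = a\boldsymbol{\omega}\times\mathbf{n} = -ia\,\boldsymbol{\omega}\wedge\mathbf{n}. \tag{5.6} $$

From (5.2) we find that $\mathbf{n}\cdot\dot{\boldsymbol{\omega}} = 0$; hence the spin about the normal

$$ s = \boldsymbol{\omega}\cdot\mathbf{n} \tag{5.7} $$

is a constant of the motion. We can combine (5.6) and (5.7) to solve for $\boldsymbol{\omega}$. Thus, since $\boldsymbol{\omega}\wedge\mathbf{n} = \boldsymbol{\omega}\mathbf{n} - s$,

$$ \mathbf{V} - ias = -ia\,\boldsymbol{\omega}\mathbf{n}, $$

so

$$ \boldsymbol{\omega} = \left(s + \frac{i\mathbf{V}}{a}\right)\mathbf{n} = s\mathbf{n} + \frac{\mathbf{n}\times\mathbf{V}}{a}. \tag{5.8} $$

Now we substitute this into (5.5) to get an equation for $\mathbf{V}$ alone:

$$ \left(1 + \frac{k^2}{a^2}\right)\mathbf{n}\wedge\dot{\mathbf{V}} = \mathbf{n}\wedge\mathbf{g}. \tag{5.9} $$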


The outer product is more convenient than the cross product form of this equation, because the condition $\mathbf{n}\cdot\mathbf{V} = 0$ from (5.6) tells us that we can divide by $\mathbf{n}$ to get

$$ \left(1 + \frac{k^2}{a^2}\right)\dot{\mathbf{V}} = \mathbf{n}(\mathbf{n}\wedge\mathbf{g}) \equiv \mathbf{g}_\parallel. \tag{5.10} $$

Thus, the sphere rolls in the plane with a constant acceleration, which has the value $(5/7)\mathbf{g}_\parallel$ if the sphere is a homogeneous solid ($k^2 = 2a^2/5$). The trajectory is therefore a parabola

$$ \mathbf{X} = \frac{5}{14}\mathbf{g}_\parallel t^2 + \mathbf{V}_0 t + \mathbf{X}_0 \tag{5.11} $$

for initial position $\mathbf{X}_0$ and velocity $\mathbf{V}_0$.
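Substituting (5.11) into (5.8), we find the explicit time dependence of the rotational velocity

$$ \boldsymbol{\omega} = \left(\frac{5}{7a}\,\mathbf{n}\times\mathbf{g}\right)t + \left(s\mathbf{n} + \frac{\mathbf{n}\times\mathbf{V}_0}{a}\right). \tag{5.12} $$

The attitude $R = R(t)$ can be determined from this by integrating $\dot{R} = \tfrac{1}{2}Ri\boldsymbol{\omega}$. However, the integration is not trivial unless the sphere starts from rest, since the direction of $\boldsymbol{\omega}$ is not constant.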
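
As a quick numerical illustration of (5.10) and (5.11), the sketch below evaluates the factor $1/(1 + k^2/a^2)$ that multiplies the tangential gravity and the resulting trajectory. The function names and the sample numbers are our own assumptions; the thin spherical shell value $k^2 = 2a^2/3$ is a standard result quoted only for comparison.

```python
from fractions import Fraction


def rolling_acceleration_factor(k2_over_a2):
    """Factor multiplying the tangential gravity in (5.10): 1 / (1 + k^2/a^2)."""
    return 1 / (1 + Fraction(k2_over_a2))


def position(t, g_par, v0, x0, k2_over_a2=Fraction(2, 5)):
    """Trajectory X(t) for constant acceleration; vectors are tuples of
    components in the inclined plane."""
    acc = float(rolling_acceleration_factor(k2_over_a2))
    return tuple(0.5 * acc * g * t * t + v * t + x
                 for g, v, x in zip(g_par, v0, x0))


print(rolling_acceleration_factor(Fraction(2, 5)))   # homogeneous solid sphere
print(rolling_acceleration_factor(Fraction(2, 3)))   # thin spherical shell
print(position(2.0, (1.4, 0.0), (0.0, 0.0), (0.0, 0.0)))
```

Exact fractions are used for the factor so that the familiar values 5/7 and 3/5 appear without rounding.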


Rolling in a spherical bowl 


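For a sphere rolling inside a fixed spherical container of radius $b$, we can write $\mathbf{X} = (a - b)\mathbf{n}$ (Figure 5.2). Therefore the unit normal $\mathbf{n}$ is a natural position variable, and the rolling constraint can be written

$$ \mathbf{V} = (a - b)\dot{\mathbf{n}} = a\boldsymbol{\omega}\times\mathbf{n}. \tag{5.13} $$

This implies that $\boldsymbol{\omega}\cdot\dot{\mathbf{n}} = 0$, and (5.2) implies that $\mathbf{n}\cdot\dot{\boldsymbol{\omega}} = 0$. Therefore, in this case also

$$ s = \boldsymbol{\omega}\cdot\mathbf{n} \tag{5.14} $$

is a constant of motion.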


Equation (5.5) subject to (5.13) is nearly the same as the equation of motion for a spherical top, so our experience with the top suggests that the best strategy is to look at once for constants of the motion. Using (5.13) to eliminate $\mathbf{V}$, and noting that $\dot{\mathbf{n}}\times(\boldsymbol{\omega}\times\mathbf{n}) = 0$, we can put (5.5) in the form

$$ \frac{d}{dt}\left[k^2\boldsymbol{\omega} + a^2\mathbf{n}\times(\boldsymbol{\omega}\times\mathbf{n})\right] = a\mathbf{n}\times\mathbf{g}. \tag{5.15} $$

Fig. 5.2. Sphere rolling in a sphere.

The quantity in square brackets here can be identified as the angular momentum of the sphere (per unit mass) about the point of contact.
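As in the case of the top, from (5.15) we find that

$$ \mathbf{g}\cdot\left[k^2\boldsymbol{\omega} + a^2\mathbf{n}\times(\boldsymbol{\omega}\times\mathbf{n})\right] = (k^2 + a^2)\,\mathbf{g}\cdot\boldsymbol{\omega} - a^2 s\,\mathbf{g}\cdot\mathbf{n} \tag{5.16} $$

is a constant of motion. We can get one other constant of motion from (5.15), namely the total energy, which we can also write down from first principles. The kinetic energy is

$$ \tfrac{1}{2}mV^2 + \tfrac{1}{2}I\omega^2 = \tfrac{1}{2}m\left[(k^2 + a^2)(\boldsymbol{\omega}\times\mathbf{n})^2 + k^2s^2\right]. $$

The potential energy is

$$ -m\mathbf{X}\cdot\mathbf{g} = m(b - a)\,\mathbf{n}\cdot\mathbf{g}. $$

So the effective energy constant is

$$ E = \tfrac{1}{2}(k^2 + a^2)(\boldsymbol{\omega}\times\mathbf{n})^2 + (b - a)\,\mathbf{n}\cdot\mathbf{g}. \tag{5.17} $$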


The three integrals of motion (5.14), (5.16) and (5.17) can be expressed as 
equations for the attitude spinor R by writing 


=a ek (5.18) 


and using $\dot{R} = \tfrac{1}{2}Ri\boldsymbol{\omega}$. Clearly, a solution in terms of elliptic functions can be found by introducing Cayley-Klein parameters in the same way that we handled the top.

When $\mathbf{n}\cdot\mathbf{g} > 0$, it is possible for the sphere to lose contact with the container. For contact to be maintained, the normal component of the reaction force must be positive. Using (5.1), this condition is expressed by

$$ \mathbf{n}\cdot\mathbf{f} = m\,\mathbf{n}\cdot(\dot{\mathbf{V}} - \mathbf{g}) \ge 0. $$

With (5.13), this contact condition can be put in the form

$$ V^2 = a^2(\boldsymbol{\omega}\times\mathbf{n})^2 \ge (b - a)\,\mathbf{n}\cdot\mathbf{g}. \tag{5.19} $$

It can be further reduced by using the energy equation (5.17).


Rolling inside a cylinder 


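For a sphere rolling inside a fixed vertical cylinder of radius $b$, it is convenient to represent the upward vertical direction by $\boldsymbol{\sigma}_3 = -\hat{\mathbf{g}}$ and parametrize the inward normal $\mathbf{n}$ of the constraining cylinder in terms of an angle $\theta$ by writing

$$ \mathbf{n} = \boldsymbol{\sigma}_1 e^{i\boldsymbol{\sigma}_3\theta}. \tag{5.20} $$

Then an explicit parametrization $\mathbf{X} = \mathbf{X}(\theta, z)$ of the center of mass is given by

$$ \mathbf{X} = (a - b)\mathbf{n} + z\boldsymbol{\sigma}_3 \tag{5.21} $$

(Figure 5.3).

Fig. 5.3. Sphere rolling inside a vertical circular cylinder.

To get suitable equations for the parameters, first note that (5.8) holds for rolling motion on any fixed surface even when $s = \boldsymbol{\omega}\cdot\mathbf{n}$ is not constant. Differentiating (5.8) we get

$$ a\dot{\boldsymbol{\omega}} = \dot{\mathbf{n}}\times\mathbf{V} + \mathbf{n}\times\dot{\mathbf{V}} + a\dot{s}\,\mathbf{n} + as\,\dot{\mathbf{n}}, $$

and substituting this into (5.5), we obtain

$$ \left(1 + \frac{k^2}{a^2}\right)\mathbf{n}\times\dot{\mathbf{V}} + \frac{k^2}{a^2}\,\dot{\mathbf{n}}\times\mathbf{V} + \frac{k^2}{a}\left(\dot{s}\,\mathbf{n} + s\,\dot{\mathbf{n}}\right) = \mathbf{n}\times\mathbf{g}. \tag{5.22} $$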
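Differentiation of (5.20) and (5.21) yields

$$ \dot{\mathbf{n}} = i\mathbf{n}\boldsymbol{\sigma}_3\dot{\theta} = \dot{\theta}\,\boldsymbol{\sigma}_3\times\mathbf{n}, $$

$$ \mathbf{V} = (a - b)\dot{\mathbf{n}} + \dot{z}\boldsymbol{\sigma}_3, $$

$$ \dot{\mathbf{V}} = (a - b)\left(i\mathbf{n}\boldsymbol{\sigma}_3\ddot{\theta} - \mathbf{n}\dot{\theta}^2\right) + \ddot{z}\boldsymbol{\sigma}_3. $$

By substituting this into (5.22) we get

$$ \left(1 + \frac{k^2}{a^2}\right)\left[(a - b)\ddot{\theta}\,\boldsymbol{\sigma}_3 + \ddot{z}\,\mathbf{n}\times\boldsymbol{\sigma}_3\right] + \frac{k^2}{a^2}\dot{\theta}\dot{z}\,\mathbf{n} + \frac{k^2}{a}\left(\dot{s}\,\mathbf{n} - s\dot{\theta}\,\mathbf{n}\times\boldsymbol{\sigma}_3\right) = -g\,\mathbf{n}\times\boldsymbol{\sigma}_3. $$

The coefficients of the orthogonal vectors $\boldsymbol{\sigma}_3$, $\mathbf{n}$ and $\mathbf{n}\times\boldsymbol{\sigma}_3$ in this equation can be equated separately to give us three scalar equations:

$$ \ddot{\theta} = 0, \tag{5.23a} $$

$$ a\dot{s} + \dot{\theta}\dot{z} = 0, \tag{5.23b} $$

$$ \ddot{z} = \frac{k^2 a}{k^2 + a^2}\,s\dot{\theta} - \frac{a^2 g}{a^2 + k^2}. \tag{5.23c} $$

From (5.23a) we find

$$ \dot{\theta} = \alpha = \text{const}. \tag{5.24} $$

When this is inserted in (5.23b) we find

$$ \alpha z + as = a\beta, \tag{5.25} $$

where $\beta$ is a second constant of integration. Using these results in (5.23c), we obtain

$$ \ddot{z} - \frac{k^2\alpha}{k^2 + a^2}(a\beta - \alpha z) + \frac{a^2 g}{k^2 + a^2} = 0. \tag{5.26} $$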


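From this equation we can conclude that the ball oscillates vertically with simple harmonic motion of period

$$ T = \frac{2\pi}{\alpha}\left(\frac{k^2 + a^2}{k^2}\right)^{1/2} \tag{5.27} $$

about the horizontal level

$$ z_0 = \frac{a}{\alpha}\left(\beta - \frac{ag}{k^2\alpha}\right). \tag{5.28} $$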

This may explain why a golfball or basketball which appears to have been 
“sunk” sometimes rises up and out of the hole. 
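
The vertical oscillation can be checked numerically. The sketch below integrates the scalar equations (5.23b) and (5.23c) with a fourth-order Runge-Kutta step for one period (5.27) and confirms that the state returns to its starting value. The helper names and the sample values (a ball of radius 2 cm circling at 10 rad/s with spin 60 rad/s) are illustrative assumptions, not data from the text.

```python
import math


def cylinder_period(alpha, k2_over_a2):
    """Period of vertical oscillation, Eq. (5.27)."""
    return (2 * math.pi / abs(alpha)) * math.sqrt((k2_over_a2 + 1) / k2_over_a2)


def equilibrium_level(alpha, beta, a, k2_over_a2, g):
    """Mean height of the oscillation, Eq. (5.28)."""
    k2 = k2_over_a2 * a * a
    return (a / alpha) * (beta - a * g / (k2 * alpha))


def derivatives(state, alpha, a, k2_over_a2, g):
    """Time derivatives of (z, zdot, s) from (5.23b) and (5.23c), thetadot = alpha."""
    z, zdot, s = state
    k2 = k2_over_a2 * a * a
    zddot = (k2 * a * alpha * s - a * a * g) / (k2 + a * a)
    sdot = -alpha * zdot / a
    return (zdot, zddot, sdot)


def rk4_step(state, dt, *params):
    """One classical Runge-Kutta step for the three-component state."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))
    s1 = derivatives(state, *params)
    s2 = derivatives(shifted(state, s1, dt / 2), *params)
    s3 = derivatives(shifted(state, s2, dt / 2), *params)
    s4 = derivatives(shifted(state, s3, dt), *params)
    return tuple(x + dt / 6 * (p + 2 * q + 2 * r + w)
                 for x, p, q, r, w in zip(state, s1, s2, s3, s4))


alpha, a, ratio, g = 10.0, 0.02, 0.4, 9.81      # assumed sample values (SI units)
T = cylinder_period(alpha, ratio)
state0 = (0.0, 0.0, 60.0)                       # z, zdot, spin s at t = 0
state = state0
steps = 2000
for _ in range(steps):
    state = rk4_step(state, T / steps, alpha, a, ratio, g)

beta = state0[2] + alpha * state0[0] / a        # constant of integration in (5.25)
print("period:", T)
print("state after one period:", state)
print("equilibrium level:", equilibrium_level(alpha, beta, a, ratio, g))
```

Because the system is linear, integrating over exactly one period is a direct test of (5.27): any error in the period would show up as a failure of the state to close on itself.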
Rolling and Slipping


If the sphere is slipping as it rolls on a fixed surface, then (5.3) gives us the slipping constraint

$$ \mathbf{v} = \mathbf{V} - a\boldsymbol{\omega}\times\mathbf{n}. \tag{5.29} $$

The appearance of this new variable $\mathbf{v}$ is offset by introducing an empirically based law for the reaction force of the form

$$ \mathbf{f} = N(\mathbf{n} - \mu\hat{\mathbf{v}}), \tag{5.30} $$

where $N = \mathbf{f}\cdot\mathbf{n} \ge 0$, and the coefficient of friction $\mu$ is a positive scalar constant characteristic of the surfaces in contact.

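Let us consider the slipping motion for the simple case of a billiard ball on a horizontal table. In this case $\mathbf{g} = -\mathbf{n}g$ and $\mathbf{n}\cdot\mathbf{V} = 0$, so after (5.30) is inserted in (5.1), we can separate the vertical component of the equation

$$ \mathbf{n}\cdot(\mathbf{f} + m\mathbf{g}) = N - mg = 0 \tag{5.31a} $$

from the horizontal component

$$ m\dot{\mathbf{V}} = -\mu N\hat{\mathbf{v}}. \tag{5.31b} $$

By eliminating $N$ between these two equations, the translational equation of motion is reduced to

$$ \dot{\mathbf{V}} = -\mu g\hat{\mathbf{v}}. \tag{5.32} $$

Similarly, by substituting (5.30) into (5.2) and using (5.31a), the rotational equation of motion is reduced to

$$ k^2\dot{\boldsymbol{\omega}} = a\mu g\,\mathbf{n}\times\hat{\mathbf{v}}. \tag{5.33} $$

By eliminating $\hat{\mathbf{v}}$ between these last two equations we get

$$ k^2\dot{\boldsymbol{\omega}} = -a\,\mathbf{n}\times\dot{\mathbf{V}} = ai\,\mathbf{n}\dot{\mathbf{V}}, \tag{5.34} $$

where $\mathbf{n}\cdot\dot{\mathbf{V}} = 0$ was used in the last step.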
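Next we differentiate the slipping equation (5.29) and eliminate $\dot{\boldsymbol{\omega}}$ with (5.34) to get

$$ \dot{\mathbf{v}} = \left(1 + \frac{a^2}{k^2}\right)\dot{\mathbf{V}}. \tag{5.35} $$

Using (5.32) to eliminate $\dot{\mathbf{V}}$, we get

$$ \dot{\mathbf{v}} = -\mu g\left(1 + \frac{a^2}{k^2}\right)\hat{\mathbf{v}}. \tag{5.36} $$

This tells us that $\hat{\mathbf{v}}$ is constant and the speed $v = |\mathbf{v}|$ is determined by

$$ \dot{v} = -\mu g\,\frac{k^2 + a^2}{k^2}. $$

So the speed decreases linearly:

$$ v = v_0 - \mu g\left(\frac{k^2 + a^2}{k^2}\right)t. \tag{5.37} $$

And slipping continues until $v = 0$ at time

$$ \tau = \frac{k^2 v_0}{\mu g(k^2 + a^2)}. \tag{5.38} $$

A billiard ball is a uniform solid, in which case $k^2 = 2a^2/5$ and $\tau = 2v_0/7\mu g$. The trajectory of the ball during slipping is obtained by integrating (5.32):

$$ \mathbf{V} = \mathbf{V}_0 - \mu gt\,\hat{\mathbf{v}}, \tag{5.39} $$

$$ \mathbf{X} = \mathbf{X}_0 + \mathbf{V}_0 t - \tfrac{1}{2}\mu gt^2\,\hat{\mathbf{v}}. \tag{5.40} $$

This is a parabola if $\mathbf{V}_0\wedge\hat{\mathbf{v}} \ne 0$. This explains (in principle!) how a billiards trick shot artist can shoot around obstacles.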

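The rotational velocity is found by inserting (5.39) and (5.37) into (5.29) to get

$$ a\boldsymbol{\omega}\times\mathbf{n} = \mathbf{V}_0 - \mathbf{v}_0 + \frac{\mu g a^2}{k^2}\,t\,\hat{\mathbf{v}}. $$

Note also that (5.33) implies that $s = \boldsymbol{\omega}\cdot\mathbf{n}$ is a constant of the motion. Combining these results, we obtain

$$ \boldsymbol{\omega} = \boldsymbol{\omega}_0 + \frac{\mu g a t}{k^2}\,\mathbf{n}\times\hat{\mathbf{v}}, \tag{5.41a} $$

where

$$ \boldsymbol{\omega}_0 = \mathbf{n}s + a^{-1}\mathbf{n}\times(\mathbf{V}_0 - \mathbf{v}_0) \tag{5.41b} $$

is the initial angular velocity. After slipping ceases, the ball rolls with a constant angular velocity

$$ \boldsymbol{\omega} = \boldsymbol{\omega}_0 + \frac{a\,\mathbf{n}\times\mathbf{v}_0}{k^2 + a^2}. \tag{5.42} $$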
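
The slipping phase is easy to tabulate for motion along a single line on the table. The sketch below uses (5.38), (5.39) and (5.41a) in scalar form, with the contact-point velocity written as $v = V - a\omega$ so that rolling means $V = a\omega$. The function names and the sample numbers are assumptions made for illustration only.

```python
def slip_time(v0, mu, k2_over_a2, g=9.81):
    """Duration of slipping, Eq. (5.38), for initial slip speed v0 >= 0."""
    return k2_over_a2 * v0 / (mu * g * (k2_over_a2 + 1))


def after_slipping(V0, a_omega0, mu, k2_over_a2=0.4, g=9.81):
    """Scalar form of (5.39) and (5.41a) evaluated when slipping ceases.

    V0 is the initial center of mass speed and a_omega0 is a times the
    initial spin component about the horizontal axis, with signs chosen so
    that pure rolling means V == a*omega.
    """
    v0 = V0 - a_omega0                       # signed slip velocity, Eq. (5.29)
    direction = (v0 > 0) - (v0 < 0)          # plays the role of the unit vector
    tau = slip_time(abs(v0), mu, k2_over_a2, g)
    V = V0 - mu * g * tau * direction
    a_omega = a_omega0 + mu * g * tau * direction / k2_over_a2
    return tau, V, a_omega


# A billiard ball launched sliding at 2 m/s with no spin (assumed values).
tau, V, a_omega = after_slipping(V0=2.0, a_omega0=0.0, mu=0.2)
print("slip time:", tau)
print("final speed:", V, "a*omega:", a_omega)
```

For a ball that starts with no spin, the final speed is 5/7 of the initial speed regardless of the coefficient of friction; only the slip time depends on $\mu$.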


Rolling on a rotating surface 


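Consider a sphere rolling on a surface which is rotating with a constant angular velocity $\boldsymbol{\Omega}$. Then, if the origin is located on the rotation axis, the center of mass position vector $\mathbf{X}$ with respect to the rotating surface is related to the position vector $\mathbf{X}'$ in the “rest system” by

$$ \mathbf{X}' = U^\dagger\mathbf{X}U, \tag{5.43} $$

where

$$ U = e^{(1/2)i\boldsymbol{\Omega}t}. \tag{5.44} $$

The kinematical variables in the two systems are therefore related by

$$ \mathbf{V}' = U^\dagger(\mathbf{V} + \boldsymbol{\Omega}\times\mathbf{X})U, $$

$$ \dot{\mathbf{V}}' = U^\dagger\left(\dot{\mathbf{V}} + 2\boldsymbol{\Omega}\times\mathbf{V} + \boldsymbol{\Omega}\times(\boldsymbol{\Omega}\times\mathbf{X})\right)U, $$

$$ \boldsymbol{\omega}' = U^\dagger\boldsymbol{\omega}U, $$

$$ \dot{\boldsymbol{\omega}}' = U^\dagger(\dot{\boldsymbol{\omega}} + \boldsymbol{\Omega}\times\boldsymbol{\omega})U. $$

Therefore, the equations of motion in the rotating system are

$$ \dot{\mathbf{V}} + 2\boldsymbol{\Omega}\times\mathbf{V} + \boldsymbol{\Omega}\times(\boldsymbol{\Omega}\times\mathbf{X}) = \mathbf{g} + m^{-1}\mathbf{f}, \tag{5.45} $$

$$ mk^2(\dot{\boldsymbol{\omega}} + \boldsymbol{\Omega}\times\boldsymbol{\omega}) = -a\mathbf{n}\times\mathbf{f}. \tag{5.46} $$

The “pseudoforces” and “pseudotorque” due to the rotation are explicitly shown. Note that the “apparent gravitational force” $m\mathbf{g}$ is a rotating vector related to the constant gravitational force $m\mathbf{g}'$ in the rest system by

$$ \mathbf{g} = U\mathbf{g}'U^\dagger. \tag{5.47} $$

The rolling constraint in the rotating system is, of course,

$$ \mathbf{V} = a\boldsymbol{\omega}\times\mathbf{n}. \tag{5.48} $$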

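By way of example, let us examine the rolling motion on a vertical plane rotating about a vertical axis (like an opening door). In this case $\boldsymbol{\Omega}\wedge\mathbf{g} = 0$, and $\mathbf{g} = \mathbf{g}'$ is a constant vector in the plane. Also, $\mathbf{n}\cdot\mathbf{g} = \mathbf{n}\cdot\boldsymbol{\Omega} = \mathbf{n}\cdot\mathbf{V} = 0$. It will be convenient to decompose $\mathbf{X}$ into a vertical component

$$ \mathbf{X}_\parallel = \mathbf{X}\cdot\hat{\mathbf{g}}\,\hat{\mathbf{g}} \tag{5.49a} $$

and a horizontal component

$$ \mathbf{X}_\perp = \hat{\mathbf{g}}\,\hat{\mathbf{g}}\wedge\mathbf{X}. \tag{5.49b} $$

Then $\boldsymbol{\Omega}\times(\boldsymbol{\Omega}\times\mathbf{X}) = -\Omega^2\mathbf{X}_\perp$, and when we eliminate the constraining force $\mathbf{f}$ between (5.45) and (5.46) we get

$$ k^2(\dot{\boldsymbol{\omega}} + \boldsymbol{\Omega}\times\boldsymbol{\omega}) = -a\mathbf{n}\times(\dot{\mathbf{V}} - \Omega^2\mathbf{X}_\perp - \mathbf{g}), $$

where the Coriolis term drops out because $\boldsymbol{\Omega}\times\mathbf{V}$ is parallel to $\mathbf{n}$. We use this to eliminate $\dot{\boldsymbol{\omega}}$ from

$$ \dot{\mathbf{V}} = a\dot{\boldsymbol{\omega}}\times\mathbf{n} $$

to get

$$ \left(\frac{k^2 + a^2}{a^2}\right)\dot{\mathbf{V}} = \frac{k^2}{a}\,\boldsymbol{\Omega}\,\mathbf{n}\cdot\boldsymbol{\omega} + \Omega^2\mathbf{X}_\perp + \mathbf{g}. \tag{5.50} $$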
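We can determine the time dependence of $\mathbf{n}\cdot\boldsymbol{\omega}$ by using (5.46) and (5.48) to get

$$ \mathbf{n}\cdot\dot{\boldsymbol{\omega}} = -\mathbf{n}\cdot(\boldsymbol{\Omega}\times\boldsymbol{\omega}) = -\boldsymbol{\Omega}\cdot(\boldsymbol{\omega}\times\mathbf{n}) = -\frac{\boldsymbol{\Omega}\cdot\mathbf{V}}{a}. $$

Integrating this, we obtain

$$ a\,\mathbf{n}\cdot\boldsymbol{\omega} = -\boldsymbol{\Omega}\cdot(\mathbf{X} - \mathbf{X}_0) + a\,\mathbf{n}\cdot\boldsymbol{\omega}_0. \tag{5.51} $$

Finally, by substituting this into (5.50) we get a determinate equation for the trajectory

$$ \left(\frac{k^2 + a^2}{a^2}\right)\ddot{\mathbf{X}} - \Omega^2\mathbf{X}_\perp + \frac{k^2}{a^2}\,\boldsymbol{\Omega}\,\boldsymbol{\Omega}\cdot\mathbf{X} = \mathbf{g} + \frac{k^2}{a^2}\left(a\,\mathbf{n}\cdot\boldsymbol{\omega}_0 + \boldsymbol{\Omega}\cdot\mathbf{X}_0\right)\boldsymbol{\Omega}. \tag{5.52} $$

This separates easily into uncoupled equations for horizontal and vertical displacements. The equations tell us that the ball recedes radially from the rotation axis with steadily increasing speed while it oscillates vertically with simple harmonic motion of period

$$ T = \frac{2\pi}{\Omega}\left(\frac{k^2 + a^2}{k^2}\right)^{1/2}. \tag{5.53} $$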
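
The two time scales contained in (5.52) can be read off directly: the vertical part is simple harmonic with period (5.53), while the horizontal part satisfies $(k^2 + a^2)\ddot{\mathbf{X}}_\perp = a^2\Omega^2\mathbf{X}_\perp$ and so grows like $e^{\lambda t}$ with $\lambda = \Omega a/(k^2 + a^2)^{1/2}$. The following sketch evaluates both; the function name and the sample rotation rate are our own assumptions.

```python
import math


def door_motion_scales(Omega, k2_over_a2=0.4):
    """Return (vertical period, horizontal growth rate) for the rotating door.

    The period is Eq. (5.53); the growth rate follows from the horizontal
    part of (5.52).
    """
    period = (2 * math.pi / Omega) * math.sqrt((k2_over_a2 + 1) / k2_over_a2)
    growth_rate = Omega / math.sqrt(k2_over_a2 + 1)
    return period, growth_rate


# A homogeneous sphere on a door turning at 1 rad/s (assumed value).
period, growth_rate = door_motion_scales(1.0)
print("vertical period [s]:", period)
print("horizontal e-folding rate [1/s]:", growth_rate)
```

Note that neither scale depends on $g$: gravity only shifts the level about which the vertical oscillation takes place.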


7-5. Exercises 


(5.1) A homogeneous sphere of radius a rolls on the outer surface of a
sphere of radius b. If it begins from rest at the highest point, at what 
point will the sphere lose contact? For what values of the coefficient 
of friction will slipping begin before contact is lost? 

(5.2) A sphere rolls on the inner surface of a right circular cone at rest
with a vertical axis. Compare its translational motion with that of a 
heavy particle constrained to move on the same surface. Show that 
the vertical component of the angular velocity is a constant of the 
motion. 

(5.3) For a sphere rolling in a spherical bowl as described in the text, show that the condition for steady motion in a horizontal circle of radius $r$ with constant angular speed $\Omega$ is


(b — a)’s’ = 35rgQ cot 0, 


where $\mathbf{g}\cdot\mathbf{n} = -g\cos\theta$.

(5.4) Determine the orbit of a homogeneous sphere rolling on a horizon- 
tal turntable. Show that there are circular orbits with period com- 
pletely determined by the period of the turntable. (K. Weltner, Am. 
J. Phys. 47, 984 (1979)). Compare the motion with that of a charged 
particle moving in a magnetic field (J. Burns, Am. J. Phys. 49, 56 
(1981)). 
(5.5) For a sphere rolling on an inclined plane which rotates with a constant angular velocity $\boldsymbol{\Omega}$, show that, with a proper choice of origin, its translational equation of motion in the rotating frame can be put in the form


[Da ijr+ a, xe= fy +900, 
where $\boldsymbol{\Omega}_\perp$ is the component of $\boldsymbol{\Omega}$ perpendicular to the plane, and f is a linear vector function. Determine f, and discuss qualitative characteristics of the motion.

(5.6) For a sphere rolling on the inner surface of a cylinder rotating about its vertical axis with a constant angular velocity, show that the vertical component of the motion is simple harmonic and determine its period.


(5.7) Analyze the motion of a homogeneous sphere rolling on a horizon-
tal plane subject to a central force specified by Hooke’s law. 
(5.8) Study the scattering of a sphere rolling on a 1/o surface of revolution 


(C. Anderson and H. von Baeger, Am. J. Phys. 38, 140 (1970)). 


7-6. Impulsive Motion 


An impulsive force $\mathbf{F}$ is very large during a short time interval $\Delta t = t - t_0$ and negligible outside that interval. The effect of an impulsive force on a particle is to produce a sudden change in velocity given, from $\mathbf{F} = m\dot{\mathbf{v}}$, by

$$ m(\mathbf{v} - \mathbf{v}_0) = \mathbf{J}, \tag{6.1} $$

where

$$ \mathbf{J} = \int_{t_0}^{t} \mathbf{F}\,dt \tag{6.2} $$

is called the impulse of the force. In saying that the impulsive force is “large”, we mean that during $\Delta t$ the change in velocity is significant and the effect of other forces is negligible. As a rule, the time interval $\Delta t$ can be taken to be so short that the change in velocity given by (6.1) can be regarded as instantaneous.

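For any system of particles to which a system of impulsive forces $\mathbf{F}_i$ is applied during $\Delta t$, we can neglect all other forces and write

$$ m\dot{\mathbf{V}} = \sum_i m_i\dot{\mathbf{v}}_i = \sum_i \mathbf{F}_i $$

during $\Delta t$. Whence the impulsive change in center of mass velocity $\mathbf{V}$ is given by

$$ m(\mathbf{V} - \mathbf{V}_0) = \sum_i \mathbf{J}_i, \tag{6.3} $$

where $\mathbf{J}_i = m_i(\mathbf{v}_i - \mathbf{v}_{i0})$ is the impulse of $\mathbf{F}_i$ on the $i$th particle.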
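Similarly, during $\Delta t$ the total angular momentum $\mathbf{l} = \mathcal{I}\boldsymbol{\omega}$ of the system of particles satisfies

$$ \dot{\mathbf{l}} = \sum_i \mathbf{r}_i\times\mathbf{F}_i. $$

The $\mathbf{r}_i$ may be regarded as fixed during $\Delta t$. Therefore, the impulsive change in rotational velocity $\boldsymbol{\omega}$ is given by

$$ \mathcal{I}(\boldsymbol{\omega} - \boldsymbol{\omega}_0) = \sum_i \mathbf{r}_i\times\mathbf{J}_i, \tag{6.4} $$

where the linearity of the inertia tensor has been used.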

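Though the impulsive forces produce only an infinitesimal displacement during $\Delta t$, they nevertheless do a finite amount of work. This produces a finite change in the kinetic energy which can be expressed in the form

$$ K - K_0 = \sum_i \tfrac{1}{2}m_i(\mathbf{v}_i^2 - \mathbf{v}_{i0}^2) = \sum_i \tfrac{1}{2}m_i(\mathbf{v}_i + \mathbf{v}_{i0})\cdot(\mathbf{v}_i - \mathbf{v}_{i0}) $$

or in the form

$$ K - K_0 = \tfrac{1}{2}m(\mathbf{V} + \mathbf{V}_0)\cdot(\mathbf{V} - \mathbf{V}_0) + \tfrac{1}{2}(\boldsymbol{\omega} + \boldsymbol{\omega}_0)\cdot\mathcal{I}(\boldsymbol{\omega} - \boldsymbol{\omega}_0). $$

Therefore, by (6.3) and (6.4), the impulsive change in kinetic energy $K$ is given by

$$ K - K_0 = \sum_i \tfrac{1}{2}(\mathbf{v}_i + \mathbf{v}_{i0})\cdot\mathbf{J}_i \tag{6.5} $$

or

$$ K - K_0 = \tfrac{1}{2}\Big[(\mathbf{V} + \mathbf{V}_0)\cdot\Big(\sum_i \mathbf{J}_i\Big) + (\boldsymbol{\omega} + \boldsymbol{\omega}_0)\cdot\Big(\sum_i \mathbf{r}_i\times\mathbf{J}_i\Big)\Big]. \tag{6.6} $$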


The impulse equations (6.3) and (6.4) apply, of course, to a rigid body, and they suffice to determine the effect of given impulses on the body. Details of the impulsive forces during $\Delta t$ are unnecessary. In fact, such data are rarely available for actual impacts. The real circumstances where (6.3) and (6.4) apply are fairly limited. At the least, it is necessary that $\Delta t$ be large compared to times for elastic waves to travel through the body, but small compared to the period of oscillation of the body as a physical pendulum.

From now on, we limit our considerations to the case of a single impulse $\mathbf{J}$ delivered to a rigid body at a point with position $\mathbf{r}$ in the center of mass system. Then the Equations (6.3) and (6.4) for translational and rotational impulse reduce to

$$ m\,\Delta\mathbf{V} = \mathbf{J}, \tag{6.7} $$

$$ \mathcal{I}(\Delta\boldsymbol{\omega}) = \mathbf{r}\times\mathbf{J}. \tag{6.8} $$

In addition, from kinematics we have

$$ \mathbf{v} = \mathbf{V} + \boldsymbol{\omega}\times\mathbf{r} \tag{6.9} $$

for the velocity $\mathbf{v}$ of the point at which the impulse is applied. These three equations are linear in the seven vectors $\mathbf{V}$, $\mathbf{V}_0$, $\boldsymbol{\omega}$, $\boldsymbol{\omega}_0$, $\mathbf{J}$, $\mathbf{r}$, $\mathbf{v}$, so they can readily be solved for any three of the vectors in terms of the other four. Usually the initial values $\mathbf{V}_0$, $\boldsymbol{\omega}_0$ and $\mathbf{r}$ are given and the final values $\mathbf{V}$, $\boldsymbol{\omega}$ are to be determined. So we have two cases of particular interest, when either $\mathbf{J}$ or $\mathbf{v}$ is prescribed.


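Motion initiated by an impulse

If a rigid body at rest is set in motion by a blow, then

$$ m\mathbf{V} = \mathbf{J}. \tag{6.10} $$

Thus the center of mass moves in the direction of the blow, and for a prescribed impulse $\mathbf{J}$ the rotational velocity is given by

$$ \boldsymbol{\omega} = \mathcal{I}^{-1}(\mathbf{r}\times\mathbf{J}). \tag{6.11} $$

Inserting this in (6.9) we find

$$ \mathbf{v} = m^{-1}\mathbf{J} + \left[\mathcal{I}^{-1}(\mathbf{r}\times\mathbf{J})\right]\times\mathbf{r}. \tag{6.12} $$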

This tells us that the particle which receives the blow does not generally move in the direction of the blow.

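Using (6.6) we find that the energy imparted to the object by the blow can be expressed in the form

$$ K = \tfrac{1}{2}mV^2\left[1 + m(\mathbf{r}\times\hat{\mathbf{J}})\cdot\mathcal{I}^{-1}(\mathbf{r}\times\hat{\mathbf{J}})\right]. \tag{6.13} $$
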

This tells us how the energy imparted varies with the direction of the blow. 
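
The relations (6.10) to (6.13) are simple enough to evaluate directly. The sketch below does so for a body whose inertia tensor is diagonal, which assumes that all vectors are expressed by components along the principal axes; that restriction, the helper names and the sample numbers are ours, not the text's. As a check, it compares the kinetic energy with the value $\tfrac{1}{2}\mathbf{v}\cdot\mathbf{J}$ that (6.5) gives for a body initially at rest.

```python
def cross(u, v):
    """Cross product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product of two tuples."""
    return sum(x * y for x, y in zip(u, v))


def blow_response(m, principal_moments, r, J):
    """Motion of a body at rest struck by impulse J at position r.

    Components are taken along the principal axes, so the inertia tensor
    acts by multiplying each component by its principal moment.
    """
    V = tuple(j / m for j in J)                                   # (6.10)
    omega = tuple(t / I for t, I in zip(cross(r, J), principal_moments))  # (6.11)
    v = tuple(a + b for a, b in zip(V, cross(omega, r)))          # (6.9), (6.12)
    K = 0.5 * m * dot(V, V) + 0.5 * sum(
        I * w * w for I, w in zip(principal_moments, omega))      # equals (6.13)
    return V, omega, v, K


V, omega, v, K = blow_response(1.0, (1.0, 2.0, 3.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print("V =", V, "omega =", omega)
print("v =", v, "K =", K, "v.J/2 =", 0.5 * dot(v, (0.0, 1.0, 0.0)))
```

In this example the struck point moves faster than the center of mass, because the rotation produced by the blow adds to the translation at that point.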

As an example, let us examine the effect of an impulse from a cue stick on a cue ball in billiards. Suppose that the cue stick is stroked in a horizontal direction in a vertical plane through the center of the cue ball. Then (6.10) and (6.11) give us the scalar relations

$$ mV = J, \tag{6.14a} $$

$$ I\omega = Jh, \tag{6.14b} $$

where $I = (2/5)ma^2$, and $h$ is the height of the contact point above the center of the cue ball (Figure 6.1). If the cue ball is to roll immediately without slipping, then the rolling condition $V = \omega a$ must be satisfied and (6.14a, b) imply that

$$ h = \frac{2}{5}a. \tag{6.15} $$

Fig. 6.1. Cue stick impulse delivered to a cue ball.

The cue ball will slip as it moves if it is “high struck” ($h > \frac{2}{5}a$) or “low struck” ($h < \frac{2}{5}a$). The ensuing motion is determined by results in Section 7-5 with (6.14a, b) as initial conditions. From (5.38) we find that the cue ball will slip for a time

$$ \tau = \frac{2v_0}{7\mu g} = \left|\frac{5h - 2a}{7a}\right|\frac{J}{mg\mu}, \tag{6.16} $$

after which, according to (5.39), it will have the velocity $\mathbf{V} = \mathbf{V}_0 - \mu g\tau\,\hat{\mathbf{v}}$.

Billiard balls are quite smooth, so little angular momentum is transferred when they collide. Moreover, the collisions are nearly elastic. When the cue ball strikes an object ball “head on”, conservation of momentum implies that all its velocity will be transferred to the object ball. However, if it is still slipping, the cue ball will then accelerate from rest and follow the object ball if it had been high struck or retreat back along its original path if it had been low struck. The first case is called a “follow shot”, while the second is called a “draw shot”.

The effect of striking the cue ball to the left or right of the median plane is to give it left or right English (= spin about the vertical axis) in addition to the rolling or slipping motions we have discussed. Ideally, English will be conserved during motion and collisions with the smooth balls, but not in collisions with the rough cushions on the billiard table.

The mechanics of billiards was developed by Coriolis (1835), and it is used in the design of equipment for the game. For example, the cushions on a billiard table are designed to make contact with a billiard ball at a height $h = (2/5)a$ above the ball’s center, so that collision with the cushion does not impart to the ball any spin about a horizontal axis.
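
The cue ball results can be combined into a short calculation. The sketch below takes the impulse and strike height, applies (6.14a, b) as initial conditions, and uses (6.16) and (5.39) in scalar form to find the slip time and the speed once rolling sets in. The function name and the sample values for mass, radius and friction are illustrative assumptions.

```python
def cue_ball_strike(J, h, m, a, mu, g=9.81):
    """Slip time and final rolling speed for a horizontal cue impulse J
    delivered at height h above the center of a uniform ball."""
    V0 = J / m                                   # (6.14a)
    a_omega0 = a * J * h / (0.4 * m * a * a)     # (6.14b) with I = (2/5) m a^2
    v0 = V0 - a_omega0                           # slip velocity of the contact point
    sign = (v0 > 0) - (v0 < 0)
    tau = abs(5 * h - 2 * a) / (7 * a) * J / (m * g * mu)   # (6.16)
    V_roll = V0 - mu * g * tau * sign            # (5.39) along the line of motion
    if sign > 0:
        shot = "low struck"
    elif sign < 0:
        shot = "high struck"
    else:
        shot = "rolls at once"
    return shot, tau, V_roll


m, a, mu, J = 0.17, 0.028575, 0.2, 0.34          # assumed ball and cloth values
for h in (0.0, 0.8 * a):
    print(h, cue_ball_strike(J, h, m, a, mu))
```

A center hit loses 2/7 of its speed to friction before rolling, whereas a ball struck at $h = 0.8a$ gains 2/7, which is the mechanism behind the follow shot.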


Constraint on the point of contact 


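An impulse may be known indirectly from its effect on the velocity of the particle to which it is applied. In that case, we eliminate $\mathbf{J}$ from (6.7) and (6.8) to get

$$ \mathcal{I}(\Delta\boldsymbol{\omega}) = m\mathbf{r}\times(\Delta\mathbf{V}). \tag{6.17} $$

Then, since the velocity $\mathbf{v}$ in (6.9) is known, the two equations (6.9) and (6.17) can be solved for the unknowns $\mathbf{V}$ and $\boldsymbol{\omega}$. Eliminating $\mathbf{V}$ between these equations to solve for $\boldsymbol{\omega}$, we get

$$ \mathcal{I}(\Delta\boldsymbol{\omega}) = m\left\{\mathbf{r}\times(\mathbf{v} - \mathbf{V}_0) + \mathbf{r}\,\mathbf{r}\cdot\boldsymbol{\omega} - r^2\boldsymbol{\omega}\right\}. \tag{6.18} $$

Note that (6.17) implies that

$$ \mathbf{r}\cdot\mathcal{I}(\Delta\boldsymbol{\omega}) = \Delta\boldsymbol{\omega}\cdot(\mathcal{I}\mathbf{r}) = 0, \tag{6.19} $$

that is, the “radial component” of the angular momentum is conserved through the impact.

To complete our solution for $\boldsymbol{\omega}$, we need a specific form for the inertia tensor $\mathcal{I}$. Let us consider the important special case where $\mathcal{I}\boldsymbol{\omega} = I\boldsymbol{\omega}$ and $\mathcal{I}\boldsymbol{\omega}_0 = I\boldsymbol{\omega}_0$, and write $I = mk^2$. Then (6.18) implies

$$ \mathbf{r}\cdot\boldsymbol{\omega} = \mathbf{r}\cdot\boldsymbol{\omega}_0, \tag{6.20} $$

and (6.18) yields

$$ \boldsymbol{\omega} = \frac{k^2\boldsymbol{\omega}_0 + \mathbf{r}\times(\mathbf{v} - \mathbf{V}_0) + \mathbf{r}\,\mathbf{r}\cdot\boldsymbol{\omega}_0}{k^2 + r^2}. \tag{6.21} $$

Finally, we can get $\mathbf{V}$ by substituting this into (6.9), and the impulse $\mathbf{J}$ which produces this result can be found from (6.7).

In the special case where the impulse brings the point of impact to rest, we have $\mathbf{v} = 0$. Consequently, the motion immediately after impact is a rotation about the point of impact. See Exercise (6.5) for an example.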


Properties of impulsive forces 


When a ball is bounced vertically off a fixed horizontal floor, it is found empirically to lose a fixed fraction of its translational kinetic energy in the bounce, irrespective of its initial velocity over a wide range. Thus, the kinetic energies before and after collision with the floor are related by

$$ \tfrac{1}{2}mV^2 = e^2\left(\tfrac{1}{2}mV_0^2\right), \tag{6.22} $$

where $e$ is a constant in the range $0 \le e \le 1$. When $e = 1$ energy is conserved, and the collision is said to be elastic. If $e = 0$ the collision is said to be completely inelastic. The constant $e$ characterizes elastic properties of the objects in collision. It is called the coefficient of restitution, because it characterizes the fact that during collision the forces of compression deforming the ball are greater than the forces of restitution restoring its original shape. We can see this by decomposing the impulse $\mathbf{J}$ delivered to the ball by the floor into two parts,

$$ \mathbf{J} = \mathbf{J}_C + \mathbf{J}_R. \tag{6.23a} $$

By definition, the compressive impulse $\mathbf{J}_C$ brings the ball to rest, so

$$ -m\mathbf{V}_0 = \mathbf{J}_C. \tag{6.23b} $$

Then, in accordance with (6.7), the restitution impulse $\mathbf{J}_R$ propels the ball from rest to its final velocity;

$$ m\mathbf{V} = \mathbf{J}_R. \tag{6.23c} $$

The relation between the forces of compression and restitution can be described by

$$ \mathbf{J}_R = e\mathbf{J}_C. \tag{6.24} $$

It then follows immediately from (6.23b, c) that

$$ \mathbf{V} = -e\mathbf{V}_0. \tag{6.25} $$

And this implies (6.22) as anticipated.

The value of this analysis lies in recognizing that the coefficient of restitution characterizes only the normal component of the impulse at the point of contact. The tangential component of the impulse derives from the frictional force, so it vanishes if the surfaces are ideally smooth. Experiments have shown that the empirical “laws of friction” are the same for impulsive forces as for the smaller forces between objects in continuous contact. Therefore, the empirical force law (5.30) implies a relation between the normal and tangential components of the impulse in a collision with slipping between surfaces. See Exercise (6.4) for an example.

To see how the coefficient of restitution is used to characterize a collision between moving bodies, let us consider a collision between two balls. Let the balls have masses $m$ and $M$ and center of mass velocities $\mathbf{V}$ and $\mathbf{U}$ respectively. According to (6.7), the impulse $\mathbf{J}$ applied by the second body on the first produces a change of velocity to

$$ \mathbf{V} = \mathbf{V}_0 + \frac{\mathbf{J}}{m}. \tag{6.26a} $$

By Newton’s third law the impulse of the first on the second must be $-\mathbf{J}$. Therefore,

$$ \mathbf{U} = \mathbf{U}_0 - \frac{\mathbf{J}}{M}, \tag{6.26b} $$

and the impulsive change in the relative velocity of the spheres is given by

$$ \mathbf{V} - \mathbf{U} = \mathbf{V}_0 - \mathbf{U}_0 + \frac{m + M}{mM}\,\mathbf{J}. \tag{6.27} $$

For application to the present problem, the relation expressed by (6.25) must be put in the more general form

$$ \mathbf{n}\cdot(\mathbf{V} - \mathbf{U}) = -e\,\mathbf{n}\cdot(\mathbf{V}_0 - \mathbf{U}_0), \tag{6.28} $$

where $\mathbf{n}$ is a unit normal to the balls at the point of contact. Then from (6.27) we find that the normal component of the impulse has the value

$$ \mathbf{J}\cdot\mathbf{n} = -\frac{mM(1 + e)}{m + M}\,\mathbf{n}\cdot(\mathbf{V}_0 - \mathbf{U}_0). \tag{6.29} $$

If the balls are perfectly smooth, then this gives the entire impulse $\mathbf{J} = \mathbf{J}\cdot\mathbf{n}\,\mathbf{n}$, and velocities after impact are completely determined from (6.26a, b). In the limit $M \to \infty$, (6.29) gives the impulse for collision with a moving wall.
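
For perfectly smooth balls the whole collision follows from (6.26a, b) and (6.29). The sketch below implements those three relations for velocity vectors given as tuples; the function names and the sample collision are assumptions chosen for illustration.

```python
def dot(u, v):
    """Scalar product of two tuples."""
    return sum(x * y for x, y in zip(u, v))


def smooth_collision(m, M, V0, U0, n, e):
    """Velocities after a collision of smooth balls with unit contact normal n.

    Returns (V, U, Jn), where Jn is the normal impulse J.n of Eq. (6.29).
    """
    relative = tuple(v - u for v, u in zip(V0, U0))
    Jn = -m * M * (1 + e) / (m + M) * dot(n, relative)      # (6.29)
    V = tuple(v + Jn * ni / m for v, ni in zip(V0, n))      # (6.26a)
    U = tuple(u - Jn * ni / M for u, ni in zip(U0, n))      # (6.26b)
    return V, U, Jn


# Head-on elastic collision of equal balls: the velocity is transferred.
V, U, Jn = smooth_collision(1.0, 1.0, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                            (1.0, 0.0, 0.0), e=1.0)
print("V =", V, "U =", U, "J.n =", Jn)
```

The result does not depend on which of the two opposite unit normals is chosen, since $\mathbf{n}$ enters the impulse $\mathbf{J}\cdot\mathbf{n}\,\mathbf{n}$ twice.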

To see the impulsive effects of friction, let us consider a ball bouncing off a fixed, plane surface. Suppose that the coefficient of restitution is unity, and suppose that friction is sufficient to eliminate slipping during contact. A commercially produced ball that comes close to meeting these requirements of perfect elasticity and roughness is called a Super-Ball. From (6.28) we obtain immediately

$$ \mathbf{n}\cdot\mathbf{V} = -\mathbf{n}\cdot\mathbf{V}_0, \tag{6.30} $$

where $\mathbf{n}$ is the unit normal to the plane (Figure 6.2). Thus, perfect elasticity implies that the normal component of the velocity is simply reversed by a bounce. Since there is no slipping during contact, the frictional force will not do work, so the total energy will be conserved in a bounce.

For a ball of radius $a$ and moment of inertia $I = m\alpha a^2$, Equation (6.8) for the angular momentum impulse gives us

$$ \alpha a\,\Delta\boldsymbol{\omega} = -\mathbf{n}\times\Delta\mathbf{V}. \tag{6.31} $$

Fig. 6.2. A ball bouncing off a fixed surface.

Multiplying this by $\mathbf{n}$, we note that the scalar part gives

$$ \mathbf{n}\cdot\boldsymbol{\omega} = \mathbf{n}\cdot\boldsymbol{\omega}_0, \tag{6.32} $$

while the bivector part can be put in the form

$$ \Delta\mathbf{V}_\parallel = \alpha\,\Delta\mathbf{u}, \tag{6.33} $$

where

$$ \mathbf{V}_\parallel = \mathbf{n}(\mathbf{n}\wedge\mathbf{V}) = -\mathbf{n}\times(\mathbf{n}\times\mathbf{V}) $$

is the tangential component of $\mathbf{V}$, and

$$ \mathbf{u} = \boldsymbol{\omega}\times(-a\mathbf{n}) \tag{6.34} $$

is the velocity of the point of contact with respect to the center of mass.

Equation (6.33) describes the conversion of linear momentum $m\Delta\mathbf{V}_\parallel$ into angular momentum due to the action of the frictional force, while (6.32) tells us that the normal component of the angular momentum is conserved. Energy conservation puts an additional restriction on the kinematical variables.

To put energy conservation in its most useful form, we decompose the kinetic energy into normal and tangential parts by writing

$$ K = \tfrac{1}{2}m\left[V_\parallel^2 + (\mathbf{n}\cdot\mathbf{V})^2\right] + \tfrac{1}{2}m\alpha a^2\left[(\boldsymbol{\omega}\times\mathbf{n})^2 + (\boldsymbol{\omega}\cdot\mathbf{n})^2\right]. $$

Since the normal components $(\mathbf{n}\cdot\mathbf{V})^2$ and $(\mathbf{n}\cdot\boldsymbol{\omega})^2$ are separately conserved in the collision, energy conservation reduces to a relation among the tangential components, which can be written

$$ V_\parallel^2 + \alpha u^2 = V_{\parallel 0}^2 + \alpha u_0^2. \tag{6.35} $$

Thus, we have reduced the description of frictional effects in a bounce to two equations (6.33) and (6.35). This is all we can learn from general dynamical principles. However, there is another property of the frictional force which we need to determine the direction of the tangential impulse.

During the bounce, the frictional force is opposite in direction to the velocity

$$ \mathbf{v} = \mathbf{V}_\parallel + \boldsymbol{\omega}\times(-a\mathbf{n}) \tag{6.36} $$

of the ball at the point of contact. Therefore, if the initial angular velocity is orthogonal to the plane of incidence, as expressed by the equation

$$ \boldsymbol{\omega}_0\cdot(\mathbf{n}\wedge\mathbf{V}_0) = \boldsymbol{\omega}_0\cdot\mathbf{n}\,\mathbf{V}_0 - \boldsymbol{\omega}_0\cdot\mathbf{V}_0\,\mathbf{n} = 0, \tag{6.37} $$

then the frictional force will lie within the plane. Consequently, the velocity impulse $\Delta\mathbf{V}$ and the trajectory after the bounce will lie in the incident plane. On the other hand, if $\boldsymbol{\omega}_0\cdot\mathbf{V}_0 \ne 0$, then the ball will have a velocity component normal to the incident plane after bouncing, that is, the ball will bounce sideways.

Let us restrict our analysis to the case where the initial condition (6.37) is 
satisfied, as presumed in Figure 6.2. Then, if $\boldsymbol{\sigma}_1$ is a unit vector as indicated in 
the figure, we can write $\mathbf{V}_\parallel = V_\parallel\boldsymbol{\sigma}_1$ and $\mathbf{u} = u\,\boldsymbol{\sigma}_1$, so (6.33) reduces to the 
scalar equation 

    $V_\parallel - V_{\parallel 0} = \alpha(u - u_0).$    (6.38)

Also (6.34) reduces to the scalar relation 

    $u = a\omega.$    (6.39)

Now the energy equation (6.35) can be put in the form 

    $V_\parallel^2 - V_{\parallel 0}^2 = -\alpha(u^2 - u_0^2),$

which, with (6.38), can be reduced to the simpler condition 

    $V_\parallel + V_{\parallel 0} = -(u + u_0),$

or 

    $V_\parallel + u = -(V_{\parallel 0} + u_0).$    (6.40)

According to (6.36), this says that the tangential velocity $\mathbf{v}$ of the contact point 
is exactly reversed by a bounce. 
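Solving (6.38) and (6.40) for the final state variables, we get 

    $a\omega = \left(\frac{\alpha - 1}{\alpha + 1}\right)a\omega_0 - \frac{2}{1 + \alpha}\,V_{\parallel 0},$    (6.41a)

    $V_\parallel = -\frac{2\alpha}{1 + \alpha}\,a\omega_0 + \left(\frac{1 - \alpha}{1 + \alpha}\right)V_{\parallel 0}.$    (6.41b)

For a Super-Ball $\alpha = 2/5$, and these equations become 

    $a\omega = -\tfrac{3}{7}\,a\omega_0 - \tfrac{10}{7}\,V_{\parallel 0},$    (6.42a)

    $V_\parallel = -\tfrac{4}{7}\,a\omega_0 + \tfrac{3}{7}\,V_{\parallel 0}.$    (6.42b)

As a particular example, let $\omega_0 = 0$. Then recalling (6.30), we find that the 
angles of incidence and rebound (Figure 6.2) are related by 

    $\tan\theta = \tfrac{3}{7}\tan\theta_0,$

and the final spin is 

    $\omega = -\frac{10}{7}\,\frac{V_{\parallel 0}}{a}.$

On multiple bounces, the Super-Ball exhibits some surprising behavior (Exercise (6.7)). 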
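
The bounce relations (6.41a,b) are easy to evaluate mechanically. The following is a minimal sketch, not part of the original text: the function name `bounce` and the use of exact fractions are choices made here for illustration, and the variable `u` stands for $a\omega$ as in (6.39).

```python
from fractions import Fraction

def bounce(v_par, u, alpha):
    """One bounce according to Eqs. (6.41a,b).

    v_par : tangential velocity of the center of mass before the bounce
    u     : a*omega before the bounce, as in Eq. (6.39)
    alpha : I / (m a^2); 2/5 for a uniform solid sphere
    Returns the pair (v_par, u) after the bounce.
    """
    u_new = (alpha - 1) / (alpha + 1) * u - 2 / (alpha + 1) * v_par
    v_new = (1 - alpha) / (1 + alpha) * v_par - 2 * alpha / (1 + alpha) * u
    return v_new, u_new

alpha = Fraction(2, 5)                 # Super-Ball value used in the text
v0, u0 = Fraction(1), Fraction(0)      # unit tangential speed, no initial spin
v1, u1 = bounce(v0, u0, alpha)
print(v1, u1)                          # 3/7 -10/7
```

With exact arithmetic one can confirm that the contact-point velocity $V_\parallel + u$ changes sign, as (6.40) requires, and that $V_\parallel^2 + \alpha u^2$ is unchanged, as (6.35) requires.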


7-6. Exercises 


(6.1) Under what conditions can the motion of a free rigid body be 
arrested by a single impulsive force? 

(6.2) An impulse J is applied at one end of a uniform bar of mass m and 
length 2a in a direction perpendicular to the bar. Find the velocity 
imparted to the other end of the bar if the bar is (a) free, or (b) fixed 
at the center of mass. 

(6.3) A flat circular disk is held at its center and struck a blow on its edge 
in a direction perpendicular to the radius and inclined at 45° to the 
plane of the disk. About what axis will it begin to rotate? Describe 
its subsequent motion. How would the motion be altered if the disk 
were tossed into the air before being struck? 

(6.4) A thin hoop of mass m and radius a slides on a frictionless horizontal 
table with its axis normal to the table and collides with a flat, 
rough, vertical wall. Initially, the hoop is not spinning and it is 
incident on the wall with speed $V_0$ at an angle of $\pi/4$. After 
momentarily sliding during contact ($\mu$ = coefficient of kinetic friction) 
the hoop rebounds. Assuming that the coefficient of restitution 
is unity, determine the angle of reflection $\theta$ and the angular velocity 
$\omega$ after collision. 

(6.5) A hoop of mass m and radius a rolls on a horizontal floor with 
velocity $v_0$ towards an inelastic step of height $h\ (< \tfrac{1}{2}a)$, the plane of 
the hoop being vertical and perpendicular to the edge of the step. 
(Figure 6.3) 

(a) Show that the angular velocity of the hoop just after colliding 
with the step is 

    $\omega = \frac{v_0}{a}\left(1 - \frac{h}{2a}\right).$

(b) Find the minimum initial velocity required for the hoop to mount the step if 
it does not slip. 

(c) Find the maximum initial velocity for which the hoop can mount the step without losing 
contact. 

Fig. 6.3. Rolling hoop colliding with a step. 

(6.6) A homogeneous solid cube is spinning freely about one of its long 
diagonals when suddenly an edge with one end on the rotation axis 
is held fixed. Show that the kinetic energy is reduced to one twelfth 
the original value. 

(6.7) Consider a Super-Ball bouncing between two parallel planes, such 
as a floor and the underside of a table. Show that with $\omega_0 = 0$, after 
three bounces 

    $\omega = \frac{130}{343}\,\frac{V_{\parallel 0}}{a},$

    $V_\parallel = -\frac{333}{343}\,V_{\parallel 0},$

showing that the motion is almost exactly reversed, as in Figure 
6.4a. What moment of inertia should a Super-Ball have if it is to 
return precisely along its original path, as in Figure 6.4b? (R. 
Garwin, Am. J. Physics 37, 88-92 (1969)). 

Fig. 6.4. A Super Ball thrown without spin will follow the path indicated in (a), bouncing 
from the floor to the underside of a table and back to the floor. The tangent of the angle of 
bounce is 3% greater than that of the angle of incidence. For comparison, the trajectory of a 
body which returns precisely along its original path is shown in (b). 

(6.8) A ball strikes a plane surface at an angle of 45° and rebounds at an 
angle of 45°. Show that the coefficient of friction $\mu$ must have the 
value 

    $\mu = \frac{1 - e}{1 + e},$

where e is the coefficient of restitution. 


Chapter 8 


Celestial Mechanics 


Celestial Mechanics is the crowning glory of Newtonian mechanics. It has 
revolutionized man’s concept of the Cosmos and his place within it. Its 
spectacular successes in the 18th and 19th centuries established the unique 
power of mathematical theory for precise explanation and prediction. In the 
20th century it has been overshadowed by exciting developments in other 
branches of physics. But the last three decades have seen a resurgence of 
interest in celestial mechanics, because it is a basic conceptual tool for the 
emerging Space Age. 

The main concern of celestial mechanics (CM) is to account for the motion 
of celestial bodies (stars, planets, satellites, etc.). The same theory applies 
to the motion of artificial satellites and spacecraft, so the emerging science of space 
flight, astromechanics, can be regarded as an offspring of celestial mechanics. 
Space Age capabilities for precise measurements and management of vast 
amounts of data have made CM more relevant than ever. Celestial mechanics 
is used by observational astronomers for the prediction and explanation of 
occultation and eclipse phenomena, by astrophysicists to model the evolution 
of binary star systems, by cosmogonists to reconstruct the history of the Solar 
System, and by geophysicists to refine models of the Earth and explain 
geological data about the past. To cite one specific example, it has recently 
been established that major Ice Ages on Earth during the last million years 
have occurred regularly with a period of 100,000 years, and this can be 
explained with celestial mechanics as forced by oscillations in the Earth’s 
eccentricity due to perturbations by other planets. Moreover, periodicities of 
minor Ice Ages can be explained as forced by precession and nutation of the 
Earth’s axis due to perturbation by the Sun and Moon. 

We have already covered a good bit of celestial mechanics in preceding 
chapters — the one and two body Kepler problems in Chapter 4, and the 
Newtonian three body problem in Section 6-5. This chapter is concerned 
mainly with perturbation theory. The standard formulation of perturbation 
theory in general use and presented in recent texts is more than a hundred 
years old. It has the drawback of appearing unnecessarily complicated and 
difficult to interpret. This chapter presents a new formulation of perturbation 
theory which exploits the advantages of geometric algebra, developing it to 
the point where calculations can be carried out efficiently. Its effectiveness is 
demonstrated by first order calculations for the principal perturbations in the 
Solar System. We stop just short of calculating the periodicities of the Ice 
Ages, which is a second order effect. 


8-1. Gravitational Forces, Fields and Torques 


The Newtonian theory of gravitation is based on Newton’s law of gravitational 
attraction between two material particles, which can be put in the form 

    $\mathbf{f}_1 = -Gmm_1\,\frac{\mathbf{x} - \mathbf{x}_1}{|\mathbf{x} - \mathbf{x}_1|^3},$    (1.1)

where G is the universal constant of gravitation, with the empirical value 

    $G = 6.67\times 10^{-8}\ \mathrm{cm^3\,g^{-1}\,s^{-2}}.$    (1.2)

The force law (1.1) specifies the force on a particle of mass $m$ at $\mathbf{x}$ due to a 
particle of mass $m_1$ at $\mathbf{x}_1$ in an inertial system. As we have noted before, the 
potential energy of the 2-particle gravitational interaction is 

    $V_1(\mathbf{x}) = -\frac{Gmm_1}{|\mathbf{x} - \mathbf{x}_1|},$    (1.3)

and this determines the gravitational force by differentiation: 

    $\mathbf{f}_1 = -\nabla V_1.$    (1.4)

It will be convenient to introduce an alternative formulation of gravitational 
interactions in terms of gravitational fields. 

We define the gravitational field $\mathbf{g}_1(\mathbf{x}, t)$ of a single particle located at 
$\mathbf{x}_1 = \mathbf{x}_1(t)$ by 

    $\mathbf{g}_1(\mathbf{x}, t) = -Gm_1\,\frac{\mathbf{x} - \mathbf{x}_1(t)}{|\mathbf{x} - \mathbf{x}_1(t)|^3}.$    (1.5)

The particle at $\mathbf{x}_1 = \mathbf{x}_1(t)$ is called the source of the field and the mass $m_1$ is the 
source strength. The field $\mathbf{g}_1$ is a function which assigns a definite vector $\mathbf{g}_1(\mathbf{x}, t)$ 
to every spatial point $\mathbf{x}$. The field at $\mathbf{x}$ may change with time $t$ due only to 
motion of its source. 

If a particle of mass $m$ is placed “in the gravitational field” $\mathbf{g}_1$ at the point $\mathbf{x}$, 
we say that the field exerts a force 

    $\mathbf{f}_1 = \mathbf{f}_1(\mathbf{x}, t) = m\,\mathbf{g}_1(\mathbf{x}, t).$    (1.6)

Although this is mathematically identical to Newton’s force law (1.1), the 
field concept provides a new view on the nature of physical reality which has 
evolved into a new branch of physics called classical field theory. From the 
Newtonian point of view, particles exert forces directly on one another in 
accordance with the force law (1.1), though they may be separated by large 
distances. From the viewpoint of field theory, however, particles interact 
indirectly through the intermediary of a field. Each material particle is the 
source of a gravitational field which, in turn, acts on other particles with a 
force depending on their masses, as specified by (1.6). The gravitational field 
is regarded as a real physical entity pervading all space surrounding its source 
and acting on any matter that happens to be present. 

The development of gravitational field theory leads ultimately to the 
conclusion that the expression (1.5) for a Newtonian gravitational field must 
be modified, and Einstein’s Theory of General Relativity proposes modifica- 
tions which have been confirmed experimentally with increasing precision 
during the last two decades. Einstein’s theory therefore sets definite limits on 
the validity of Newtonian theory, but it also tells us that the corrections to 
Newtonian theory are utterly negligible in most physical situations. So it will 
be worth our while to study the implications of Newtonian gravitation theory 
without getting involved in the deeper subtleties of field theory. 

The concept of a gravitational field has a formal mathematical advantage 
even within the context of Newtonian theory. It enables us to separate 
gravitational interactions into two parts which can be analyzed separately, 
namely, (a) the production of gravitational fields by extended sources, and 
(b) the effect of a given gravitational field on given bodies. We study the 
production of fields first. 

The one-particle Newtonian field (1.5) can be derived from the gravitational 
potential 

    $\phi_1(\mathbf{x}, t) = \frac{-Gm_1}{|\mathbf{x} - \mathbf{x}_1(t)|}$    (1.7)

by differentiation; thus, 

    $\mathbf{g}_1(\mathbf{x}, t) = -\nabla\phi_1(\mathbf{x}, t),$    (1.8)

where $\nabla = \nabla_{\mathbf{x}}$ is the derivative (or gradient) with respect to $\mathbf{x}$. The gravitational 
potential energy (1.3) of a particle with mass $m$ at $\mathbf{x}$ is given by 
$V_1(\mathbf{x}, t) = m\phi_1(\mathbf{x}, t)$. However, it is essential to distinguish clearly between the 
concepts of “potential” and “potential energy”. The latter is shared energy of 
two interacting objects, while the former is characteristic of a single object, its 
source. 

The gravitational field $\mathbf{g}(\mathbf{x}, t)$ of an N-particle system is given by the 
superposition of fields: 

    $\mathbf{g}(\mathbf{x}, t) = \sum_{k=1}^{N}\mathbf{g}_k(\mathbf{x}, t) = -G\sum_{k=1}^{N} m_k\,\frac{\mathbf{x} - \mathbf{x}_k(t)}{|\mathbf{x} - \mathbf{x}_k(t)|^3}.$    (1.9)

On a particle of mass $m$ at $\mathbf{x}$, this field exerts a force 

    $\mathbf{f} = m\mathbf{g} = \sum_k \mathbf{f}_k,$    (1.10)

as required by the superposition law for forces. This field can also be derived 
from a potential; thus, 

    $\mathbf{g}(\mathbf{x}, t) = -\nabla\phi(\mathbf{x}, t),$    (1.11)

where 

    $\phi(\mathbf{x}, t) = \sum_k \phi_k(\mathbf{x}, t) = -G\sum_k \frac{m_k}{|\mathbf{x} - \mathbf{x}_k(t)|}.$    (1.12)

And the potential energy of a particle in the field is given by 

    $V(\mathbf{x}, t) = m\phi(\mathbf{x}, t).$    (1.13)


Note that this does not include the potential energy of interaction between 
the particles producing the field. The internal energy of the N-particle system 
can be ignored as long as we are concerned only with the influence of the 
system on external objects. 
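
The superposition formulas (1.9) and (1.12) and the relation (1.11) between them can be checked numerically. The sketch below is illustrative only: the two source masses and the field point are made-up values, the SI value of G is an assumption of this example, and the gradient is approximated by central differences.

```python
import math

G = 6.674e-11  # m^3 kg^-1 s^-2 (SI value, assumed for this example)

def potential(x, sources):
    """phi(x) = -G * sum_k m_k / |x - x_k|, as in Eq. (1.12)."""
    return -G * sum(m / math.dist(x, xk) for m, xk in sources)

def field(x, sources):
    """g(x) = -G * sum_k m_k (x - x_k) / |x - x_k|^3, as in Eq. (1.9)."""
    g = [0.0, 0.0, 0.0]
    for m, xk in sources:
        d = math.dist(x, xk)
        for i in range(3):
            g[i] -= G * m * (x[i] - xk[i]) / d**3
    return g

def numerical_gradient(f, x, h=1e-3):
    """Central-difference approximation to the gradient of f at x."""
    grad = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

# Hypothetical sources: (mass in kg, position in m).
sources = [(5.0e10, (0.0, 0.0, 0.0)), (2.0e10, (3.0, 1.0, 0.0))]
x = (1.0, 2.0, 2.0)

g = field(x, sources)
minus_grad_phi = [-c for c in numerical_gradient(lambda p: potential(p, sources), x)]
print(g)
print(minus_grad_phi)
```

The two printed vectors should agree to within the accuracy of the finite-difference step, illustrating that the summed field is the negative gradient of the summed potential.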


The Gravitational Field of an Extended Object 


The gravitational field of a continuous body is obtained from the field of a 
system of particles by the same limiting process used in Section 7—2 to define 
the center of mass and inertia tensor for continuous bodies. Thus, we 
subdivide the body into small parts which can be regarded as particulate, and 
in the limit of an infinitely small subdivision the sum (1.9) becomes the 
integral 


    $\mathbf{g}(\mathbf{x}, t) = -G\int dm'\,\frac{\mathbf{x} - \mathbf{x}'}{|\mathbf{x} - \mathbf{x}'|^3},$    (1.14)

where $dm' = dm(\mathbf{x}', t)$ is the mass of an “infinitesimal” corpuscle (small 
body) at the point $\mathbf{x}'$ at time $t$. Similarly, the limit of (1.12) gives us the 
gravitational potential of a continuous body: 

    $\phi(\mathbf{x}, t) = -G\int \frac{dm'}{|\mathbf{x} - \mathbf{x}'|}.$    (1.15)

Hereafter, we will not indicate the time dependence explicitly. The relation 
$\mathbf{g} = -\nabla\phi$ still applies here. This enables us to find $\mathbf{g}$ by differentiation after 
evaluating the integral for $\phi$ in (1.15). 

For a spherically symmetric body the integral is easy to evaluate. We choose 
the origin at the body’s center of mass and indicate this by writing $\mathbf{r}$ and $\mathbf{r}'$ 
instead of $\mathbf{x}$ and $\mathbf{x}'$ (Figure 1.1). The symmetry is expressed by writing the 
mass density as a function of radial distance only. Thus, 

    $dm' = \rho(r')\,r'^2\,dr'\,d\Omega,$

where $d\Omega = \sin\theta\,d\theta\,d\phi$ is the “element of solid angle”, and 

    $\phi(\mathbf{r}) = -G\int \frac{dm'}{|\mathbf{r} - \mathbf{r}'|} = -G\int_0^R \rho(r')\,r'^2\,dr'\oint \frac{d\Omega}{|\mathbf{r} - \mathbf{r}'|}.$

For $r = |\mathbf{r}| > r' = |\mathbf{r}'|$, we can easily evaluate the integral 

    $\oint \frac{d\Omega}{|\mathbf{r} - \mathbf{r}'|} = 2\pi\int_0^\pi \frac{\sin\theta\,d\theta}{(r^2 + r'^2 - 2rr'\cos\theta)^{1/2}} = \frac{4\pi}{r}.$    (1.16)

And the remaining integral simply gives the total mass of the body 

    $M = \int dm' = 4\pi\int_0^R \rho(r')\,r'^2\,dr'.$

Therefore, the external gravitational potential of a spherically symmetric body 
is given by 

    $\phi(\mathbf{r}) = -G\int \frac{dm'}{|\mathbf{r} - \mathbf{r}'|} = -\frac{GM}{r}.$    (1.17)

This is identical to the potential of a point particle with the same mass located 
at the mass center, so the gravitational field $\mathbf{g} = -\nabla\phi$ of the body is 
also the same as for a particle. Since many celestial bodies are nearly 
spherically symmetric, this is an excellent first approximation to their gravitational 
fields, indeed, a sufficient approximation in many circumstances. 

Fig. 1.1. Points inside the sphere are denoted by the primed variable; external points are 
denoted by the unprimed variable. 

A more accurate description of gravitational fields is best achieved 
by evaluating the effects of deviations from spherical symmetry. We expand 
the potential of a given body in a Taylor series about its center of mass. Since 
$r > r'$, we have 


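    $\frac{1}{|\mathbf{r} - \mathbf{r}'|} = \frac{1}{r}\left\{1 + \sum_{n=1}^{\infty}\frac{P_n(\mathbf{r}\mathbf{r}')}{r^{2n}}\right\}$
    $\qquad\quad\ \ = \frac{1}{r}\left\{1 + \sum_{n=1}^{\infty}\left(\frac{r'}{r}\right)^{n} P_n(\hat{\mathbf{r}}\hat{\mathbf{r}}')\right\},$    (1.18)

where the $P_n$ are the Legendre polynomials (see Exercise (8.4) in Section 
2-8). We will need explicit expressions only for 

    $P_1(\mathbf{r}\mathbf{r}') = \mathbf{r}\cdot\mathbf{r}',$
    $P_2(\mathbf{r}\mathbf{r}') = \tfrac{1}{2}\left[3(\mathbf{r}\cdot\mathbf{r}')^2 - r^2r'^2\right],$
    $P_3(\mathbf{r}\mathbf{r}') = \tfrac{1}{2}\left[5(\mathbf{r}\cdot\mathbf{r}')^3 - 3r^2r'^2\,\mathbf{r}\cdot\mathbf{r}'\right].$    (1.19)

The last line of (1.18) shows that the magnitude of the nth term in the 
expansion is of the order $(r'/r)^n$, so the series converges rapidly at a distance $r$ 
which is large compared to the dimensions of the body. The expansion (1.18) 
gives us an expansion of the potential, 

    $\phi(\mathbf{r}) = -\frac{G}{r}\left\{M + \frac{1}{r^2}\int P_1(\mathbf{r}\mathbf{r}')\,dm' + \frac{1}{r^4}\int P_2(\mathbf{r}\mathbf{r}')\,dm' + \cdots\right\}.$

By (1.19), 

    $\int P_1(\mathbf{r}\mathbf{r}')\,dm' = \mathbf{r}\cdot\left[\int \mathbf{r}'\,dm'\right] = \mathbf{r}\cdot[0] = 0,$

since the center of mass is located at the origin. Recall (from Section 7-2) 
that the inertia tensor $\mathcal{I}$ of the body is defined by 

    $\mathcal{I}\mathbf{r} = \int dm'\,\mathbf{r}'\,\mathbf{r}'\wedge\mathbf{r} = \int dm'\,(r'^2\,\mathbf{r} - \mathbf{r}'\,\mathbf{r}'\cdot\mathbf{r}),$    (1.20a)

and the trace of the inertia tensor is given by 

    $\mathrm{Tr}\,\mathcal{I} = 2\int dm'\,r'^2 = I_1 + I_2 + I_3,$    (1.20b)

where $I_1$, $I_2$, $I_3$ are principal moments of inertia. Therefore, 

    $\int dm'\,P_2(\mathbf{r}\mathbf{r}') = \int dm'\,\tfrac{1}{2}\left[3(\mathbf{r}\cdot\mathbf{r}')^2 - r^2r'^2\right]$
    $\qquad\quad\ \ = \tfrac{1}{2}\left(r^2\,\mathrm{Tr}\,\mathcal{I} - 3\,\mathbf{r}\cdot\mathcal{I}\mathbf{r}\right) = \tfrac{1}{2}\,\mathbf{r}\cdot\mathcal{Q}\mathbf{r},$    (1.21)

which defines a symmetric tensor 

    $\mathcal{Q}\mathbf{r} = \mathbf{r}\,\mathrm{Tr}\,\mathcal{I} - 3\,\mathcal{I}\mathbf{r}.$    (1.22)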

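Adopting well-established terminology from electromagnetic theory, we may 
refer to $\mathcal{Q}$ as the gravitational quadrupole tensor. 
Now the expanded potential can be written 

    $\phi(\mathbf{r}) = -\frac{G}{r}\left\{M + \frac{1}{2}\,\frac{\mathbf{r}\cdot\mathcal{Q}\mathbf{r}}{r^4} + \cdots\right\}.$    (1.23)

This is called a harmonic (or multipole) expansion of the potential. The 
quadrupole term describes the first order deviation from the field of a 
spherically symmetric body. From this the gravitational field $\mathbf{g} = -\nabla\phi$ can be 
obtained with the help of 

    $\nabla\left(\tfrac{1}{2}\,\mathbf{r}\cdot\mathcal{Q}\mathbf{r}\right) = \mathcal{Q}\mathbf{r},$
    $\nabla r^n = n r^{n-1}\,\hat{\mathbf{r}}.$

Thus, 

    $\mathbf{g}(\mathbf{r}) = -\frac{G}{r^2}\left\{M\hat{\mathbf{r}} - \frac{1}{r^2}\left[\mathcal{Q}\hat{\mathbf{r}} - \tfrac{5}{2}(\hat{\mathbf{r}}\cdot\mathcal{Q}\hat{\mathbf{r}})\,\hat{\mathbf{r}}\right] + \cdots\right\}.$    (1.24)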

This expression for the gravitational field holds for a body of arbitrary shape 
and density distribution. 
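
To see how well the monopole-plus-quadrupole field (1.24) works, it can be compared with the exact field of a small cluster of point masses. This is only a sketch under stated assumptions: the six-mass "body" is hypothetical, units are chosen so that G = 1, and the inertia and quadrupole tensors are applied as linear functions following (1.20a), (1.20b) and (1.22).

```python
import math

G = 1.0  # units with G = 1 (assumption; only relative errors matter here)

# Hypothetical lumpy body: (mass, position) pairs with center of mass at the origin.
body = [(1.0, (1.0, 0.0, 0.0)), (1.0, (-1.0, 0.0, 0.0)),
        (2.0, (0.0, 0.5, 0.0)), (2.0, (0.0, -0.5, 0.0)),
        (1.0, (0.0, 0.0, 0.3)), (1.0, (0.0, 0.0, -0.3))]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

M = sum(m for m, _ in body)
trace_I = 2.0 * sum(m * dot(rp, rp) for m, rp in body)        # Eq. (1.20b)

def inertia(r):
    """I r = sum m' (r'^2 r - r' r'.r), Eq. (1.20a)."""
    out = [0.0, 0.0, 0.0]
    for m, rp in body:
        for i in range(3):
            out[i] += m * (dot(rp, rp) * r[i] - rp[i] * dot(rp, r))
    return out

def quadrupole(r):
    """Q r = r Tr I - 3 I r, Eq. (1.22)."""
    Ir = inertia(r)
    return [trace_I * r[i] - 3.0 * Ir[i] for i in range(3)]

def g_exact(r):
    """Direct sum over the point masses, Eq. (1.9)."""
    g = [0.0, 0.0, 0.0]
    for m, rp in body:
        d = math.dist(r, rp)
        for i in range(3):
            g[i] -= G * m * (r[i] - rp[i]) / d**3
    return g

def g_multipole(r):
    """Monopole plus quadrupole terms of Eq. (1.24)."""
    rn = math.sqrt(dot(r, r))
    rhat = [c / rn for c in r]
    Qr = quadrupole(rhat)
    rQr = dot(rhat, Qr)
    return [-(G / rn**2) * (M * rhat[i] - (Qr[i] - 2.5 * rQr * rhat[i]) / rn**2)
            for i in range(3)]

r = (6.0, 8.0, 5.0)
exact = g_exact(r)
monopole = [-(G * M / dot(r, r) ** 1.5) * c for c in r]
err_mono = math.dist(exact, monopole)
err_quad = math.dist(exact, g_multipole(r))
print(err_mono, err_quad)
```

At this distance (about eleven body radii) the quadrupole correction should reduce the error of the point-mass approximation by roughly an order of magnitude.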

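From Section 7-2 we know that the inertia tensor for an axisymmetric body 
can be put in the form 

    $\mathcal{I}\mathbf{r} = I_1\mathbf{r} + (I_3 - I_1)\,\mathbf{r}\cdot\mathbf{u}\,\mathbf{u},$    (1.25)

where $I_1 = I_2$ is the “equatorial moment of inertia”, $I_3$ is the “polar moment of 
inertia”, and the unit vector $\mathbf{u}$ is the direction of the symmetry axis. Then (1.22) and 
(1.20b) give us 

    $\mathcal{Q}\mathbf{r} = (I_3 - I_1)(\mathbf{r} - 3\,\mathbf{r}\cdot\mathbf{u}\,\mathbf{u}).$    (1.26)

From (1.24), therefore, the gravitational field for an axisymmetric body is 
given by 

    $\mathbf{g}(\mathbf{r}) = -\frac{MG}{r^2}\left\{\hat{\mathbf{r}} + \tfrac{3}{2}J_2\left(\frac{R}{r}\right)^2\left[\left(1 - 5(\hat{\mathbf{r}}\cdot\mathbf{u})^2\right)\hat{\mathbf{r}} + 2\,\mathbf{u}\cdot\hat{\mathbf{r}}\,\mathbf{u}\right] + \cdots\right\},$    (1.27)

where $R$ is the equatorial radius of the body and $J_2$ is defined by 

    $J_2 = \frac{I_3 - I_1}{MR^2}.$    (1.28)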

The constant $J_2$ is a dimensionless measure of the oblateness of the body, and 
the factor $(R/r)^2$ in (1.27) measures the rate at which the oblateness effect falls 
off with distance. 

For an axisymmetric body the effects of harmonics higher than the quadrupole 
are not difficult to find, because the series (1.18) integrates to a harmonic 
expansion for the potential with the form 

    $\phi(\mathbf{r}) = -\frac{MG}{r}\left\{1 - \sum_{n=2}^{\infty} J_n\left(\frac{R}{r}\right)^{n} P_n(\hat{\mathbf{r}}\cdot\mathbf{u})\right\},$    (1.29)

where the $J_n$ are constant coefficients. As mentioned above, $J_2$ is a measure of 
oblateness and is related to the moments of inertia by (1.28). The constant $J_3$ 
measures the degree to which the body is “pear-shaped” (i.e. southern 
hemisphere fatter than northern hemisphere). The advantage of (1.29) is that 
it can be immediately written down once axial symmetry has been assumed, 
and the $J_n$ can be determined empirically, in particular, from data on orbiting 
satellites. For the Earth, 

    $J_2 = 1083\times 10^{-6}$
    $J_3 = -2.5\times 10^{-6}$
    $J_4 = -1.6\times 10^{-6}$
    $J_5 = -0.2\times 10^{-6}.$    (1.30)

Clearly, the quadrupole harmonic strongly dominates. Although for $n > 2$ 
the $J_n$ are of the same order of magnitude, the contributions of the harmonics 
decrease with $n$ because of the factor $(R/r)^n$ in (1.29). Since the $J_n$ are 
independent of radius, comparison of the $J_n$ for different planets is a meaningful 
quantitative way to compare shapes of planets. 
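
As a consistency check between the potential (1.29), truncated at n = 2, and the field (1.27), one can differentiate the potential numerically. The sketch below does this for the Earth; the values of GM and R are standard figures assumed for the example (they are not taken from the text), and the field point is arbitrary.

```python
import math

GM = 3.986e14        # m^3 s^-2, Earth (assumed standard value)
R = 6.378e6          # m, equatorial radius (assumed standard value)
J2 = 1.083e-3        # oblateness coefficient, cf. (1.30)
u = (0.0, 0.0, 1.0)  # symmetry axis

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(r):
    """Potential (1.29) truncated after the n = 2 term."""
    rn = math.sqrt(dot(r, r))
    x = dot(r, u) / rn
    p2 = 0.5 * (3.0 * x * x - 1.0)
    return -(GM / rn) * (1.0 - J2 * (R / rn) ** 2 * p2)

def g_J2(r):
    """Field (1.27): monopole plus oblateness term."""
    rn = math.sqrt(dot(r, r))
    rhat = [c / rn for c in r]
    x = dot(rhat, u)
    c2 = 1.5 * J2 * (R / rn) ** 2
    return [-(GM / rn**2) * (rhat[i] + c2 * ((1.0 - 5.0 * x * x) * rhat[i] + 2.0 * x * u[i]))
            for i in range(3)]

def minus_gradient(f, r, h=10.0):
    """Central-difference approximation to -grad f."""
    out = []
    for i in range(3):
        rp, rm = list(r), list(r)
        rp[i] += h
        rm[i] -= h
        out.append(-(f(rp) - f(rm)) / (2.0 * h))
    return out

r = (4.0e6, 3.0e6, 5.0e6)
analytic = g_J2(r)
numeric = minus_gradient(phi, r)
print(analytic)
print(numeric)
```

The two vectors should agree far more closely than the size of the oblateness term itself, which is of relative order $J_2 (R/r)^2$.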


Gravitational Force and Torque on an Extended Object 
The total force exerted by a gravitational field on a system of particles is 

    $\mathbf{f}(t) = \sum_k m_k\,\mathbf{g}(\mathbf{x}_k(t), t),$    (1.31)

where $\mathbf{x}_k(t)$ is the position of the kth particle at time $t$. In the limit for a 
continuous body this becomes 

    $\mathbf{f} = \int dm\,\mathbf{g}(\mathbf{x}),$    (1.32)

where the time dependence has been suppressed. 
For the force on an extended body due to a particle of mass $M$ at the origin, 
(1.32) gives us 

    $\mathbf{f} = -GM\int dm'\,\frac{\mathbf{r} + \mathbf{r}'}{|\mathbf{r} + \mathbf{r}'|^3},$    (1.33)

where $\mathbf{r}$ is the center of mass of the body and the variable of integration has 
been changed from $\mathbf{x}$ to $\mathbf{r}'$ (Figure 1.2). In accordance with Newton's third 
law, the expression (1.33) for the force of a particle on a body differs 
only in sign from the expression that (1.14) gives for the force 
of a body on a particle. Consequently, the result of approximating 
the right side of (1.33) by expanding the denominator can be written down at 
once from the previous approximation of the gravitational 
field. From (1.24) with (1.22) we get 

    $\mathbf{f} = -\frac{GM}{r^2}\left\{m\hat{\mathbf{r}} + \frac{1}{r^2}\left[\tfrac{3}{2}\left(\mathrm{Tr}\,\mathcal{I} - 5\,\hat{\mathbf{r}}\cdot\mathcal{I}\hat{\mathbf{r}}\right)\hat{\mathbf{r}} + 3\,\mathcal{I}\hat{\mathbf{r}}\right]\right\}.$    (1.34)

To second order, this is the gravitational force of a particle (or a spherically 
symmetric body) of mass $M$ on an extended body with mass $m$ and inertia 
tensor $\mathcal{I}$. This is quite a good approximation for many purposes in celestial 
mechanics. 

Fig. 1.2. Integration variable for the gravitational force of a point particle on an extended body. 

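The gravitational torque on a body (with center of mass at $\mathbf{r}$ as base point) 
is given by 

    $\boldsymbol{\Gamma} = \int dm'\,\mathbf{r}'\times\mathbf{g} = -\mathbf{r}\times\mathbf{f} + \int dm'\,\mathbf{x}\times\mathbf{g},$    (1.35)

where $\mathbf{r}' = \mathbf{x} - \mathbf{r}$, as in Figure 1.2. If the field is produced by a particle at the 
origin, then $\mathbf{x}\times\mathbf{g} = 0$. Therefore substitution of (1.34) into (1.35) gives us 

    $\boldsymbol{\Gamma} = \frac{3GM}{r^3}\,\hat{\mathbf{r}}\times\mathcal{I}\hat{\mathbf{r}}.$    (1.36)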


This is a useful expression for the torque on a satellite. Note that it vanishes 
identically if the body is spherically symmetric. 
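
The torque formula (1.36) is simple to evaluate once the inertia tensor is expressed in its principal axes. The following sketch assumes, for illustration only, that the body's principal axes coincide with the coordinate axes, so that the inertia tensor acts componentwise; the function name and the sample numbers are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gravity_gradient_torque(GM, r, principal_moments):
    """Torque (3GM/r^3) rhat x (I rhat), Eq. (1.36).

    r                 : position of the body's center of mass relative to the source
    principal_moments : (I1, I2, I3), with principal axes along x, y, z (assumption)
    """
    rn = math.sqrt(sum(c * c for c in r))
    rhat = [c / rn for c in r]
    I_rhat = [principal_moments[i] * rhat[i] for i in range(3)]
    k = 3.0 * GM / rn**3
    return [k * c for c in cross(rhat, I_rhat)]

theta = 0.3                                    # angle between rhat and the symmetry axis
r = (2.0 * math.sin(theta), 0.0, 2.0 * math.cos(theta))
torque_axisym = gravity_gradient_torque(1.0, r, (1.0, 1.0, 1.5))
torque_sphere = gravity_gradient_torque(1.0, r, (1.0, 1.0, 1.0))
print(torque_axisym)
print(torque_sphere)
```

For the axisymmetric case the only nonzero component is perpendicular to the plane containing $\hat{\mathbf{r}}$ and the symmetry axis, with magnitude $(3GM/r^3)(I_3 - I_1)\sin\theta\cos\theta$; for equal moments the torque vanishes up to rounding.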


Tidal Forces 


In the preceding subsection we examined the gravitational force and torque 
on a body as a whole. Besides these effects, a nonuniform gravitational field 
produces internal stresses in a body called tidal forces. To have a specific 
example in mind, let us consider tidal forces on the Earth due to the Moon. 

We aim to determine the tidal forces on the surface of a spherical Earth (Figure 1.3). 
For a particle of unit mass at rest at a point $\mathbf{R}$ on the surface of the Earth, 
we have the equation of motion 

    $-GM\frac{\mathbf{R}}{R^3} - Gm\frac{\mathbf{R} - \mathbf{r}}{|\mathbf{R} - \mathbf{r}|^3} + \mathbf{f} = \mathbf{a}.$    (1.37)

Fig. 1.3. Geocentric coordinates for the Earth and Moon. 

The first two terms are the gravitational attractions of the Earth and Moon 
respectively, and $\mathbf{f}$ is a force of constraint due to the rigidity of the Earth. The 
term $\mathbf{a}$ on the right is the acceleration of the noninertial system in which the 
particle is at rest. Considering only the two body motion of the Earth and 
Moon about their common center of mass, we have 

    $\mathbf{a} = Gm\frac{\mathbf{r}}{r^3}.$    (1.38)

To this we could add the centripetal acceleration due to the rotation of the 
Earth, but we have already studied its effect in Section 5-6, and it is completely 
independent of tidal effects. The effect of the Earth’s rotation on the 
tides is due entirely to the rotation of the position vector $\mathbf{R}$, as we shall see. 

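Inserting (1.38) into (1.37), we find that the constraining force required to 
keep the particle at rest is given by 

    $\mathbf{f} = GM\frac{\mathbf{R}}{R^3} + Gm\left[\frac{\mathbf{R} - \mathbf{r}}{|\mathbf{R} - \mathbf{r}|^3} + \frac{\mathbf{r}}{r^3}\right].$    (1.39)

The negative of the last term is the tidal force. Since the radius of the Earth 
$R = |\mathbf{R}|$ is far smaller than the Earth-Moon distance $r = |\mathbf{r}|$, we can employ 
the binomial expansion 

    $\frac{1}{|\mathbf{R} - \mathbf{r}|^3} = \frac{1}{(r^2 - 2\,\mathbf{r}\cdot\mathbf{R} + R^2)^{3/2}} = \frac{1}{r^3}\left(1 + \frac{3\,\mathbf{r}\cdot\mathbf{R}}{r^2} + \cdots\right).$

So, as a first order approximation for the tidal force $\mathbf{g}_t = \mathbf{g}_t(\mathbf{R})$, we have 

    $\mathbf{g}_t(\mathbf{R}) = -Gm\left[\frac{\mathbf{R} - \mathbf{r}}{|\mathbf{R} - \mathbf{r}|^3} + \frac{\mathbf{r}}{r^3}\right] \approx \frac{Gm}{r^3}\left(3\,\mathbf{R}\cdot\hat{\mathbf{r}}\,\hat{\mathbf{r}} - \mathbf{R}\right).$    (1.40)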


This expression applies to every point on the surface of the Earth; the 
distribution of forces is shown in Figure 1.4. Note that the Earth is under 
tension along the Earth-Moon axis and under compression perpendicular 
to the axis. The symmetry of the tidal field may be surprising, in particular, 
the fact that the tidal force at the point closest to the Moon has 
the same magnitude as at the furthest point. This symmetry disappears in 
higher order approximations, but they are negligible in the present case. 
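
The quality of the first order tidal formula (1.40) can be judged by comparing it with the exact difference of attractions. This is a rough sketch: the lunar mass, the Earth-Moon distance and the Earth's radius are standard round values assumed for the example, and the Moon is placed on the x axis.

```python
import math

G = 6.674e-11        # m^3 kg^-1 s^-2 (assumed SI value)
m_moon = 7.342e22    # kg (assumed)
r_moon = 3.844e8     # m, mean Earth-Moon distance (assumed)
R_earth = 6.371e6    # m, mean Earth radius (assumed)

r = (r_moon, 0.0, 0.0)   # Moon's position relative to the Earth's center

def tidal_exact(Rv):
    """-Gm [ (R - r)/|R - r|^3 + r/r^3 ], the exact expression in (1.40)."""
    d = math.dist(Rv, r)
    return [-G * m_moon * ((Rv[i] - r[i]) / d**3 + r[i] / r_moon**3) for i in range(3)]

def tidal_approx(Rv):
    """(Gm/r^3) (3 R.rhat rhat - R), the first order form in (1.40)."""
    rhat = [c / r_moon for c in r]
    R_dot_rhat = sum(a * b for a, b in zip(Rv, rhat))
    return [G * m_moon / r_moon**3 * (3.0 * R_dot_rhat * rhat[i] - Rv[i]) for i in range(3)]

near = (R_earth, 0.0, 0.0)    # point directly under the Moon
far = (-R_earth, 0.0, 0.0)    # antipodal point
side = (0.0, R_earth, 0.0)    # point 90 degrees away from the Earth-Moon axis

for name, point in (("near", near), ("far", far), ("side", side)):
    print(name, tidal_exact(point), tidal_approx(point))
```

In the first order formula the near and far points feel equal and opposite outward forces of about $10^{-6}$ m/s², while the side point is pulled inward; the exact values differ from these by a few percent, of relative order $R/r$.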

In the Earth-Moon orbital plane, the tidal force (1.40) has a component 
tangent to the Earth’s surface except at the four points $\theta = 0, \pi/2, \pi, 3\pi/2$. 
Consequently, water on the surface of the Earth piles up at these points into 
two high and two low tides. Since the Earth’s axis is nearly perpendicular to 
the Earth-Moon orbital plane, the tidal bulges are swept over the surface of 
the Earth by the Earth’s rotation, producing the semi-diurnal tides, that is, 
alternating high and low tides every twelve hours. A tidal bulge at one point 
on the Earth rotates “out from under the Moon” before it collapses, since 
frictional forces retard its collapse as well as its build-up. Consequently, an 
observer on the Earth will see a time lag between the appearance of the Moon 
overhead and the maximum tide (Figure 1.5). 

Fig. 1.4. The tidal force field on the surface of the Earth. 

Tidal friction dissipates the Earth’s rotational energy, thus reducing the Earth’s 
angular velocity and gradually increasing the length of a day. At the same time the 
tidal force reacts on the Moon to accelerate it. This increases the Moon’s orbital 
energy, so it recedes from the Earth, and its orbital period gradually increases. 
Though energy is dissipated, the overall angular momentum of the Earth-Moon system 
is conserved under the action of tidal forces. So tidal forces drive a transfer of spin 
angular momentum to orbital angular momentum. This process will continue until the 
length of the day equals the length of the month, and the vanishing of tidal friction 
prevents further angular momentum exchange. However, solar tides will continue to slow 
down the Earth’s rotation, so the Moon will begin to approach the Earth again with an 
orbital period locked in synchrony with the Earth’s rotation. 

Fig. 1.5. Tidal lag. 

Observational evidence leads to a value of $4.4\ \mathrm{cm\,yr^{-1}}$ for the rate of tidal 
recession of the Moon. By angular momentum conservation, one can infer 
from this a rate of change in the length of day of about two milliseconds per 
century. Techniques for measuring such small time changes have been developed 
only recently. The rate of tidal energy dissipation corresponding to such 
a change is on the order of $10^{12}$ W. 
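
The order of magnitude of the dissipation rate can be estimated from the quoted lengthening of the day alone. The sketch below computes the rate at which the Earth loses rotational kinetic energy; the polar moment of inertia and the rotation rate are standard values assumed here, and the result is only an upper estimate of the dissipation, since part of the lost spin energy is transferred to the Moon's orbit.

```python
# Rough estimate of the Earth's rotational energy loss rate.
C = 8.04e37              # kg m^2, Earth's polar moment of inertia (assumed value)
OMEGA = 7.292e-5         # rad/s, Earth's rotation rate (assumed value)
DAY = 86400.0            # s, length of day
CENTURY = 3.156e9        # s, one century

dT_dt = 2.0e-3 / CENTURY              # lengthening of the day: 2 ms per century
domega_dt = -OMEGA * dT_dt / DAY      # from Omega = 2 pi / T
power = -C * OMEGA * domega_dt        # -d/dt (C Omega^2 / 2), in watts

print(f"{power:.2e} W")
```

With these inputs the loss rate comes out at a few times $10^{12}$ W.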

A closer look at the tidal phenomena raises many questions which are 
subjects for active geophysical research today. By precisely what mechanism 
is tidal energy dissipated, and how accurately can it be estimated from 
geophysical models? What is the relative effectiveness of surface tides and 
body tides in dissipating energy? Has the rate of tidal energy dissipation been 
uniform over geologically long times? Then how close to the Earth was the 
Moon in the distant past? What bearing does this have on theories of the 
Moon’s formation? 

Of course, the tidal mechanism is operating throughout the solar system. In 
general, satellites in subsynchronous orbits spiral in towards the parent 
planet, while satellites in suprasynchronous orbits spiral outward. The moons 
of Mars, Phobos and Deimos, fall, respectively, into these two categories. 

Now let us examine a tidal effect which has been ignored in our discussion 
so far. The inclination of the Moon’s position with respect to the equator 
varies between 18° and 29° each month. To describe its effect on the distribution 
and periodicity of the tides, we consider the radial component of the tidal force, 
which, by (1.40), has the value 

    $\hat{\mathbf{R}}\cdot\mathbf{g}_t = \frac{GmR}{r^3}\left[3(\hat{\mathbf{R}}\cdot\hat{\mathbf{r}})^2 - 1\right].$    (1.41)

The tidal periodicity is completely determined by the term in brackets. We can 
analyze it by introducing equatorial coordinates as shown in Figure 1.6. Let $\theta$ be 
the colatitude and $\phi$ the longitude of a fixed point $\mathbf{R}$ on the Earth, and let 
$\theta_M$ and $\phi_M$ be the corresponding coordinates for the direction $\hat{\mathbf{r}}$ of the Moon. 
Applying the law of cosines from Appendix A to the spherical triangle in Figure 1.6, 
we obtain 

    $\hat{\mathbf{R}}\cdot\hat{\mathbf{r}} = \cos\theta\cos\theta_M + \sin\theta\sin\theta_M\cos(\phi - \phi_M),$    (1.42)

whence 

    $\left[3(\hat{\mathbf{R}}\cdot\hat{\mathbf{r}})^2 - 1\right] = \tfrac{3}{2}\sin^2\theta\,\sin^2\theta_M\cos 2(\phi - \phi_M)$
    $\qquad\quad + \tfrac{3}{2}\sin 2\theta\,\sin 2\theta_M\cos(\phi - \phi_M)$
    $\qquad\quad + \tfrac{1}{2}(3\cos^2\theta - 1)(3\cos^2\theta_M - 1).$    (1.43)

Fig. 1.6. Equatorial coordinates for the position of the Moon and a point on the Earth. 
N indicates north and $\phi_0$ designates the zero of longitude on the prime meridian. 

The three terms in this sum show three different periodic variations in the 
tidal force. With $\phi = \Omega t$, the first term shows the major effect of a semi-diurnal 
periodicity. The second term has diurnal periodicity, while the third 
oscillates twice a month due to the motion of the Moon. 
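
The decomposition (1.43) is a trigonometric identity and can be spot-checked numerically. The sketch below compares the two sides at randomly chosen angles; the function names and the labels attached to the three terms are introduced here for illustration.

```python
import math
import random

def bracket_direct(theta, phi, theta_m, phi_m):
    """3 (Rhat.rhat)^2 - 1, with Rhat.rhat taken from Eq. (1.42)."""
    c = (math.cos(theta) * math.cos(theta_m)
         + math.sin(theta) * math.sin(theta_m) * math.cos(phi - phi_m))
    return 3.0 * c * c - 1.0

def bracket_terms(theta, phi, theta_m, phi_m):
    """The three periodic terms on the right side of Eq. (1.43)."""
    d = phi - phi_m
    semidiurnal = 1.5 * math.sin(theta) ** 2 * math.sin(theta_m) ** 2 * math.cos(2.0 * d)
    diurnal = 1.5 * math.sin(2.0 * theta) * math.sin(2.0 * theta_m) * math.cos(d)
    fortnightly = 0.5 * (3.0 * math.cos(theta) ** 2 - 1.0) * (3.0 * math.cos(theta_m) ** 2 - 1.0)
    return semidiurnal, diurnal, fortnightly

random.seed(0)
max_diff = 0.0
for _ in range(1000):
    angles = (random.uniform(0.0, math.pi), random.uniform(0.0, 2.0 * math.pi),
              random.uniform(0.0, math.pi), random.uniform(0.0, 2.0 * math.pi))
    diff = abs(bracket_direct(*angles) - sum(bracket_terms(*angles)))
    max_diff = max(max_diff, diff)
print(max_diff)
```

The largest discrepancy should be at the level of floating point rounding. Note that the diurnal term vanishes when the Moon lies in the equatorial plane ($\theta_M = \pi/2$), which is why it is tied to the Moon's inclination.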

The Sun produces tides on the Earth in the same way as the Moon. From 
(1.40), the relative magnitude of solar and lunar tidal forces is 

    $\frac{M_S}{M_M}\left(\frac{r_M}{r_S}\right)^3 \approx 0.46.$    (1.44)

The Sun and the Moon combine to produce high “spring” tides and low “neap” 
tides. 
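
The ratio (1.44) follows from the $m/r^3$ dependence of the tidal force in (1.40). As a quick arithmetic check, using standard masses and mean distances that are assumed here rather than taken from the text:

```python
M_SUN = 1.989e30     # kg (assumed)
M_MOON = 7.342e22    # kg (assumed)
R_SUN = 1.496e11     # m, mean Earth-Sun distance (assumed)
R_MOON = 3.844e8     # m, mean Earth-Moon distance (assumed)

# Tidal force scales as (source mass) / (distance)^3, cf. Eq. (1.40).
solar_to_lunar = (M_SUN / M_MOON) * (R_MOON / R_SUN) ** 3
print(round(solar_to_lunar, 2))
```

So although the Sun's direct attraction on the Earth is far stronger than the Moon's, its tidal effect is a little less than half as large.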


The Shape of the Earth 


The shape of the Earth plays an important role in cartography and many 
geophysical phenomena, so we need a precise way to characterize it. A 
theory of the Earth’s shape is developed by supposing that it originated from 
the cooling of a spinning molten mass to form a solid crust. Regarding the 
oscillating tides induced by other bodies as secondary effects to be considered 
separately, we model the Earth as a spinning fluid held together in steady 
state equilibrium by its gravitational field. In a geocentric frame spinning with 
the Earth the fluid will be at rest, with an effective gravitational potential at its 
surface of the form 

    $\Phi(\mathbf{r}) = \phi(\mathbf{r}) - \tfrac{1}{2}(\boldsymbol{\Omega}\times\mathbf{r})^2,$    (1.45)

where $\phi(\mathbf{r})$ is the true gravitational potential and the last term is the centrifugal 
pseudopotential. The effective gravitational field 

    $\mathbf{g} = -\nabla\Phi$    (1.46)

must be normal to the surface, for if it had a tangential component that would 
force fluid to flow on the surface. That means the surface of the Earth is an 
equipotential surface defined by the equation 

    $\Phi(\mathbf{r}) = K,$

where $K$ is a constant to be determined. 

Since the spinning fluid is axisymmetrical, its gravitational potential $\phi$ can 
be described by the Legendre expansion (1.29), so, to second order, the 
shape of the Earth is described explicitly by the equation 

    $\Phi(\mathbf{r}) = -\frac{GM}{r}\left\{1 - J_2\left(\frac{a}{r}\right)^2\tfrac{1}{2}\left[3(\hat{\mathbf{r}}\cdot\mathbf{u})^2 - 1\right]\right\} - \tfrac{1}{2}\Omega^2r^2\left[1 - (\hat{\mathbf{r}}\cdot\mathbf{u})^2\right] = K,$    (1.47)

where $\mathbf{u} = \hat{\boldsymbol{\Omega}}$ specifies the rotation axis and $a$ is the equatorial radius of the 
Earth. The surface described by this equation is called the geoid. Its deviation 
from a sphere is characterized by a parameter $\epsilon$ called the flattening (or the 
ellipticity) and defined by 

    $\epsilon = \frac{a - c}{c},$    (1.48)

where $c$ is the polar radius of the Earth. The constant $K$ in (1.47) is evaluated 
by setting $r = c$ with $\mathbf{u}\cdot\hat{\mathbf{r}} = 1$, yielding 

    $K = -\frac{GM}{c}\left(1 - J_2\frac{a^2}{c^2}\right).$    (1.49)


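The flattening $\epsilon$ can be expressed in terms of the other parameters by setting 
$r = a$ and $\hat{\mathbf{r}}\cdot\mathbf{u} = 0$ in (1.47). That gives 

    $-\frac{GM}{a}\left[1 + \tfrac{1}{2}J_2\right] - \tfrac{1}{2}\Omega^2a^2 = K = -\frac{GM}{c}\left(1 - J_2\frac{a^2}{c^2}\right).$

Since $\epsilon$ and $J_2$ are known to be small quantities, it suffices to solve this 
equation to first order, and we get 

    $\epsilon = \tfrac{3}{2}J_2 + \tfrac{1}{2}\beta,$    (1.50)

where 

    $\beta = \frac{\Omega^2a^3}{GM} = \frac{\Omega^2a}{GM/a^2}$    (1.51)

is the ratio of centripetal force to gravitational force at the equator. 
The expression for the geopotential can now be written in the form 

    $\Phi(\mathbf{r}) = -\frac{GM}{r}\left\{1 + \tfrac{1}{3}\left(\epsilon - \tfrac{1}{2}\beta\right)\left(\frac{a}{r}\right)^2\left[1 - 3(\hat{\mathbf{r}}\cdot\mathbf{u})^2\right] + \tfrac{1}{2}\beta\left(\frac{r}{a}\right)^3\left[1 - (\hat{\mathbf{r}}\cdot\mathbf{u})^2\right]\right\}.$    (1.52)

This can be used to determine the flattening from empirical data. For the 
gravitational acceleration at the pole $g_p$ and at the equator $g_e$, it gives 

    $g_p = \frac{GM}{a^2}(1 + \beta),$
    $g_e = \frac{GM}{a^2}\left(1 + \epsilon - \tfrac{3}{2}\beta\right).$    (1.53)

Gravimetric measurements give the values 

    $g_p = 983.217\ \mathrm{cm/sec^2},$
    $g_e = 978.039\ \mathrm{cm/sec^2}.$    (1.54)

Using values for $a$ and $\Omega$ from Appendix C, from (1.53) we calculate 

    $\epsilon = 0.003376,$
    $\beta = 0.003468.$    (1.55)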


As a check on the internal consistency of the theory, in (1.50) these numbers 
give a value for $J_2$ which agrees with the value in (1.30) from satellite data to 
better than one percent. 
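
The calculation leading from the gravimetric values (1.54) to the flattening can be sketched in a few lines. Here the equatorial radius and rotation rate are standard values assumed for the example (the text takes them from its Appendix C, which may differ slightly), so the results reproduce (1.55) only approximately; the satellite value of $J_2$ used for comparison is likewise an assumed round figure.

```python
OMEGA = 7.2921e-5    # rad/s, Earth's rotation rate (assumed value)
a = 6.3781e8         # cm, equatorial radius (assumed value)
g_p = 983.217        # cm/s^2, polar gravity, Eq. (1.54)
g_e = 978.039        # cm/s^2, equatorial gravity, Eq. (1.54)

centripetal = OMEGA**2 * a                 # Omega^2 a in cm/s^2

# From (1.51) and the first line of (1.53): g_p = GM/a^2 + Omega^2 a.
gm_over_a2 = g_p - centripetal
beta = centripetal / gm_over_a2            # Eq. (1.51)

# Second line of (1.53) solved for the flattening.
epsilon = g_e / gm_over_a2 - 1.0 + 1.5 * beta

# Eq. (1.50) solved for J2.
J2 = (2.0 / 3.0) * (epsilon - 0.5 * beta)

print(f"beta    = {beta:.6f}")
print(f"epsilon = {epsilon:.6f}")
print(f"J2      = {J2:.3e}")
```

The values obtained this way are close to those in (1.55), and the inferred $J_2$ is within about one percent of the satellite-derived value of roughly $1.083\times 10^{-3}$.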

The shape of the Earth as described by the geoid (1.47) agrees with 
measurements of sea level to within a few meters. However, radar ranging to 
measure the height of the ocean is accurate to a fraction of a meter. So 
geodesists are engaged in developing more refined models for the shape of 
the Earth. The main deviation from the geoid is an excessive bulge around the 
equator. This has been attributed to a retardation in the rotation of the Earth 
over past millions of years — one more clue among many to be fed into a 
conceptual reconstruction of the Earth’s history. 


8-1. Exercises 


(1.1) For an axisymmetric body, a harmonic expansion of the gravitational 
field can be put in the form 

    $\mathbf{g}(\mathbf{r}) = -\frac{MG}{r^2}\left(\hat{\mathbf{r}} + \sum_{n=2}^{\infty}\mathbf{g}_n(\mathbf{r})\right);$

$\mathbf{g}_2(\mathbf{r})$ is given in Equation (1.27). From Equation (1.29), show that 

    $\mathbf{g}_3(\mathbf{r}) = \tfrac{1}{2}J_3\left(\frac{R}{r}\right)^3\left\{5\left[3 - 7(\hat{\mathbf{r}}\cdot\mathbf{u})^2\right]\hat{\mathbf{r}}\cdot\mathbf{u}\,\hat{\mathbf{r}} + 3\left[5(\hat{\mathbf{r}}\cdot\mathbf{u})^2 - 1\right]\mathbf{u}\right\}.$

(1.2) Show that to second order the gravitational force of one extended 
body on another is given by 

    $\mathbf{f} = -\frac{Gm_1m_2}{r^3}\,\mathbf{r} - m_1\mathbf{g}_2 - m_2\mathbf{g}_1,$

where 

    $\mathbf{g}_k = \frac{G}{r^4}\left[3\,\mathcal{I}_k\hat{\mathbf{r}} + \tfrac{3}{2}\left(\mathrm{Tr}\,\mathcal{I}_k - 5\,\hat{\mathbf{r}}\cdot\mathcal{I}_k\hat{\mathbf{r}}\right)\hat{\mathbf{r}}\right]$

and $\mathcal{I}_k$ is the inertia tensor for body k. 

(1.3) The variation $\Delta h$ in the water level from low to high tides can be 
derived from Equation (1.41), assuming that the water is in equilibrium 
in the unperturbed field $-GM\mathbf{R}/R^3$. Show that variations at the 
equator are of the order 

    $\Delta h = \frac{3}{2}\,\frac{m}{M}\,\frac{R^4}{r^3} \approx 0.5\ \mathrm{m}.$

This is comparable to the variations observed on small Pacific atolls 
which approximate open ocean conditions. How does $\Delta h$ vary with 
latitude? 

(1.4) The periodicity of the first term in Equation (1.43) is not quite 
semi-diurnal, since the angle $2(\phi - \phi_M)$ depends on the motion of the 
Moon as well as the rotation of the Earth. Show that its actual 
periodicity is 12 h and 26.5 min, so high tide is observed about 53 
min later each day. 

(1.5) Derive the Equations (1.53) and check the computation of (1.55). 

(1.6) The geoid is nearly an oblate spheroid. That can be established by 
considering the equation for an oblate spheroid, 

    $1 = \frac{(\mathbf{r}\cdot\mathbf{u})^2}{c^2} + \frac{(\mathbf{r}\times\mathbf{u})^2}{b^2} = \frac{r^2}{b^2}\left[1 + \left(\frac{b^2 - c^2}{c^2}\right)(\hat{\mathbf{r}}\cdot\mathbf{u})^2\right].$

For small $e' = (b - c)/c$, this can be put in the approximate form 

    $r = b\left[1 + 2e'(\hat{\mathbf{r}}\cdot\mathbf{u})^2\right]^{-1/2} \approx b\left[1 - e'(\hat{\mathbf{r}}\cdot\mathbf{u})^2\right].$

Show that to first order in $\epsilon$ the Equation (1.52) for the geoid can be 
put in this form, and relate the parameters $\epsilon$ and $a$ there to the 
parameters $e'$ and $b$ here. 


8-2. Perturbations of Kepler Motion 


The Newtonian two-body problem is the only dynamical problem in celestial 
mechanics for which an exact general solution is known. We call the motion in 
that case Kepler motion. Approximate solutions to a large class of more 
difficult dynamical problems are best characterized as perturbations (or
disturbances) of Kepler motion. To that end, we write the translational equation
of motion for a celestial body (planet, satellite, spacecraft, etc.) in the form

$$\ddot{\mathbf r} = -\mu\frac{\mathbf r}{r^3} + \mathbf f, \tag{2.1}$$

where f is referred to as the perturbing force (per unit mass). The perturbing
force is said to be small if |f| << μ/r², that is, if it is much smaller in
magnitude than the Newtonian force. If the primary body is large enough to
be regarded at rest, then μ = MG in (2.1), where M is the mass of the
primary. Otherwise, μ should include the two-body correction determined in
Section 4-6. We can always insert the two-body correction at the end of our
calculations if the degree of precision requires it.

Gravitational perturbation theory is concerned with general methods for 
solving Equation (2.1) for any specified perturbing force. Several methods 
have been widely employed for a long time. However, we shall develop here a 
new coordinate-free method exploiting the advantages of geometric algebra. 
A more sophisticated method will be developed in Section 8-4. Instead of 
attacking the equation directly, it is best to reformulate the problem by using 
our knowledge about Kepler motion to take the Newtonian force into account 
once and for all. Then we can analyze the effect of the perturbing force 
separately. 

From our study of the Kepler problem in Section 4-3, we know that the
instantaneous values of the position and velocity vectors r and v determine a
unique Kepler orbit, which may be an ellipse, a hyperbola or a parabola. The
orbit is completely characterized by an angular momentum vector (per unit
mass)

$$\mathbf h = \mathbf r\times\mathbf v \tag{2.2}$$

and an eccentricity vector ε given by

$$\mu(\boldsymbol\varepsilon + \hat{\mathbf r}) = \mathbf v\times\mathbf h = i\mathbf h\mathbf v. \tag{2.3}$$

The vectors h and ε are called orbital elements. Although these two vectors
characterize the orbit completely, we have seen that it is useful to classify
orbits by the values of another orbital element, the energy (per unit mass) E,
given by

$$E = \tfrac12v^2 - \frac{\mu}{r} = \frac{\mu^2(\varepsilon^2 - 1)}{2h^2}. \tag{2.4}$$

Unique values for the orbital elements h and ε are determined at each time t by
(2.2) and (2.3), even in the presence of a perturbing force f. They specify a
Kepler orbit instantaneously tangent to the actual orbit at the point r = r(t).
This Kepler orbit is called the osculating orbit of the motion. Since h and ε
are constants of the unperturbed motion, the osculating orbit will be identical
with the actual orbit when f = 0. When f ≠ 0, h and ε
are no longer constant, so the osculating orbit must change continuously in
time. We can picture the perturbed motion as motion of a particle on a Kepler
orbit which is being continuously deformed by the perturbing force. This
picture is of great value for understanding the effects of perturbations.

Fig. 2.1. The osculating orbit is instantaneously tangent to the true
orbit at the point r = r(t).
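As a concrete illustration of how (2.2) and (2.3) fix the osculating orbit from an instantaneous state, here is a minimal sketch in ordinary vector components. The helper names and the sample state (in units with μ = 1) are hypothetical and chosen only for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x*y for x, y in zip(a, b))

def osculating_elements(r, v, mu):
    """Return (h, ecc): angular momentum (2.2) and eccentricity vector (2.3)."""
    h = cross(r, v)                                   # h = r x v
    r_mag = math.sqrt(dot(r, r))
    v_cross_h = cross(v, h)
    # mu*(ecc + r_hat) = v x h, solved for the eccentricity vector
    ecc = tuple(v_cross_h[k]/mu - r[k]/r_mag for k in range(3))
    return h, ecc

# Hypothetical state vector in units where mu = 1.
h, ecc = osculating_elements((1.0, 0.0, 0.0), (0.0, 1.2, 0.1), mu=1.0)
print(h, ecc)
```

Whatever the state, the two vectors returned satisfy h·ε = 0, which is the redundancy among the elements discussed later in this section.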

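To describe the deformation of an osculating orbit analytically, we need
equations of motion for ε and h. These are easily found by differentiating
(2.2) and (2.3) and eliminating derivatives of v and $\hat{\mathbf r}$ using (2.1) and the
identity

$$\frac{d\hat{\mathbf r}}{dt} = \frac{i\hat{\mathbf r}\mathbf h}{r^2} = \frac{\mathbf h\times\hat{\mathbf r}}{r^2} \tag{2.5}$$

established in Section 4-3.
In this way, we obtain the coupled equations

$$\dot{\mathbf h} = \mathbf r\times\mathbf f, \tag{2.6}$$
$$\mu\dot{\boldsymbol\varepsilon} = i(\dot{\mathbf h}\mathbf v + \mathbf h\mathbf f) = \mathbf v\times\dot{\mathbf h} + \mathbf f\times\mathbf h. \tag{2.7}$$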


Since h and ε are constants of the unperturbed motion, they will be slowly
varying functions in the presence of a small perturbation, and approximate
solutions to their equations of motion (2.6) and (2.7) will be easy to find. This
is the main reason for considering perturbations of orbital elements rather
than working directly with the Newtonian equation of motion (2.1). However,
our formulation of perturbation theory can be further improved.
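The coupled equations (2.6) and (2.7) translate directly into component form. The sketch below assumes the vector reading μ dε/dt = v × dh/dt + f × h of (2.7). The sample state and perturbing force are arbitrary illustrative numbers. As a consistency check it evaluates d(h·ε)/dt, which should vanish because h and ε stay orthogonal.

```python
import math

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x*y for x, y in zip(a, b))

def eccentricity_vector(r, v, mu):
    """Eccentricity vector from (2.3): mu*(ecc + r_hat) = v x h."""
    h = cross(r, v)
    r_mag = math.sqrt(dot(r, r))
    vxh = cross(v, h)
    return tuple(vxh[k]/mu - r[k]/r_mag for k in range(3))

def element_rates(r, v, f, mu):
    """Time derivatives of h and ecc from (2.6) and (2.7)."""
    h = cross(r, v)
    h_dot = cross(r, f)                                # (2.6)
    term1 = cross(v, h_dot)
    term2 = cross(f, h)
    ecc_dot = tuple((term1[k] + term2[k])/mu for k in range(3))  # (2.7)
    return h_dot, ecc_dot

# Arbitrary illustrative state and perturbing force (units with mu = 1).
r = (1.0, 0.2, -0.1)
v = (-0.1, 1.1, 0.3)
f = (0.01, -0.02, 0.03)
h = cross(r, v)
ecc = eccentricity_vector(r, v, 1.0)
h_dot, ecc_dot = element_rates(r, v, f, 1.0)
orthogonality_rate = dot(h_dot, ecc) + dot(h, ecc_dot)   # d(h.ecc)/dt
print(orthogonality_rate)
```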

The main drawback of a perturbation theory for h and ε is the fact that
these vectors are not independent of one another. Each vector alone is
equivalent to three scalar elements, but together they are equivalent to only
five because of the orthogonality condition h·ε = 0. We can eliminate the
redundancy due to this constraint by introducing a spinor R determined by
the equations

$$\hat{\mathbf h} = R^\dagger\boldsymbol\sigma_3R = \mathbf e_3, \tag{2.8a}$$
$$\hat{\boldsymbol\varepsilon} = R^\dagger\boldsymbol\sigma_1R = \mathbf e_1. \tag{2.8b}$$

These equations determine a dextral frame

$$\mathbf e_k = R^\dagger\boldsymbol\sigma_kR \quad (k = 1, 2, 3), \tag{2.9}$$


which we call a Kepler frame, because it specifies the attitude in space of the 
osculating Kepler orbit. Given a fixed frame {o,}, the attitude of the osculat- 
ing orbit is completely determined by the spinor R, so let us refer to R as the 
attitude element. 

Instead of the redundant vector elements h and ε, we can work with the
independent elements R, h = |h| and ε = |ε|. This choice has the additional
advantage of a direct geometrical meaning. While the attitude of the
osculating orbit is described by the spinor R, its size and shape are described by h
and ε. Actually, the orbit size is directly described by the Kepler energy
element E defined by (2.4), while ε is the shape parameter. With (2.4) we can
eliminate any one of the elements h, ε, E in favor of the other two, but it is
best not to commit ourselves to a particular choice prematurely.

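From our study of rotational kinematics, we know that the attitude element
R obeys an equation of form

$$\dot R = \tfrac12R\,i\boldsymbol\omega, \tag{2.10}$$

so we need to determine ω from the perturbing force. Because of (2.8a) and
(2.8b), the derivatives of h and ε can be put in the form

$$\dot{\mathbf h} = \boldsymbol\omega\times\mathbf h + \dot h\,\hat{\mathbf h}, \tag{2.11a}$$
$$\dot{\boldsymbol\varepsilon} = \boldsymbol\omega\times\boldsymbol\varepsilon + \dot\varepsilon\,\hat{\boldsymbol\varepsilon}. \tag{2.11b}$$

These equations can be solved for ω as follows: First eliminate the unwanted
$\dot h$ and $\dot\varepsilon$ by the multiplications

$$\mathbf h\times\dot{\mathbf h} = \mathbf h\times(\boldsymbol\omega\times\mathbf h) = \mathbf h\cdot(\mathbf h\wedge\boldsymbol\omega) = h^2\boldsymbol\omega - \mathbf h\cdot\boldsymbol\omega\,\mathbf h,$$
$$\boldsymbol\varepsilon\times\dot{\boldsymbol\varepsilon} = \boldsymbol\varepsilon\times(\boldsymbol\omega\times\boldsymbol\varepsilon) = \boldsymbol\varepsilon\cdot(\boldsymbol\varepsilon\wedge\boldsymbol\omega) = \varepsilon^2\boldsymbol\omega - \boldsymbol\varepsilon\cdot\boldsymbol\omega\,\boldsymbol\varepsilon.$$

The first of these equations can be solved for ω if we get an independent
relation for h·ω from the second. Since ε·h = 0, we get

$$(\boldsymbol\varepsilon\times\dot{\boldsymbol\varepsilon})\cdot\mathbf h = \varepsilon^2\,\boldsymbol\omega\cdot\mathbf h.$$

Therefore,

$$\boldsymbol\omega = \frac{\mathbf h\times\dot{\mathbf h}}{h^2} + \frac{(\boldsymbol\varepsilon\times\dot{\boldsymbol\varepsilon})\cdot\mathbf h}{\varepsilon^2h^2}\,\mathbf h. \tag{2.12}$$

This is, of course, a general kinematical result giving the rotational velocity of
the frame determined by any two time dependent orthogonal vectors.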
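The kinematical formula (2.12) is easy to test numerically. In the sketch below, two orthogonal vectors are given derivatives of the form (2.11a, b) with a known rotational velocity and arbitrary rates of change in magnitude, and (2.12) is used to recover that rotational velocity. All numerical values are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x*y for x, y in zip(a, b))

def rotational_velocity(h, h_dot, e, e_dot):
    """Rotational velocity of the frame fixed by orthogonal vectors h and e, per (2.12)."""
    h2 = dot(h, h)
    e2 = dot(e, e)
    tilt = cross(h, h_dot)                       # h x h_dot
    spin = dot(cross(e, e_dot), h)/(e2*h2)       # coefficient of h
    return tuple(tilt[k]/h2 + spin*h[k] for k in range(3))

# Hypothetical data: orthogonal h and e, rotating with omega_true while
# their magnitudes change at arbitrary rates.
h = (0.0, 0.0, 2.0)
e = (0.5, 0.0, 0.0)
omega_true = (0.1, -0.2, 0.3)
wxh = cross(omega_true, h)
wxe = cross(omega_true, e)
h_dot = tuple(wxh[k] + 0.3*h[k] for k in range(3))   # form of (2.11a)
e_dot = tuple(wxe[k] - 0.1*e[k] for k in range(3))   # form of (2.11b)

omega = rotational_velocity(h, h_dot, e, e_dot)
print(omega)
```

The magnitude rates drop out, as they should: only the rotation of the frame contributes to the recovered vector.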

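To ascertain how ω depends on the perturbing force we insert the equations
of motion for h and ε into the kinematic formula (2.12). Note that (2.6) gives
us

$$\mathbf h\times\dot{\mathbf h} = \mathbf h\times(\mathbf r\times\mathbf f) = \mathbf h\cdot\mathbf f\,\mathbf r, \tag{2.13}$$

since h·r = 0. And (2.7) gives

$$\mu\,\boldsymbol\varepsilon\times\dot{\boldsymbol\varepsilon} = \boldsymbol\varepsilon\times(\mathbf v\times\dot{\mathbf h}) + \boldsymbol\varepsilon\times(\mathbf f\times\mathbf h) = \boldsymbol\varepsilon\cdot\dot{\mathbf h}\,\mathbf v - \boldsymbol\varepsilon\cdot\mathbf v\,\dot{\mathbf h} - \boldsymbol\varepsilon\cdot\mathbf f\,\mathbf h.$$

Hence,

$$\mu(\boldsymbol\varepsilon\times\dot{\boldsymbol\varepsilon})\cdot\mathbf h = -\boldsymbol\varepsilon\cdot\mathbf v\,(\mathbf r\times\mathbf f)\cdot\mathbf h - h^2\boldsymbol\varepsilon\cdot\mathbf f. \tag{2.14}$$

Substituting (2.13) and (2.14) into (2.12), we get the desired expression

$$\boldsymbol\omega = \frac{\mathbf h\cdot\mathbf f\,\mathbf r}{h^2} - \frac{\boldsymbol\varepsilon\cdot\mathbf v\,(\mathbf r\times\mathbf f)\cdot\mathbf h + h^2\boldsymbol\varepsilon\cdot\mathbf f}{\mu\varepsilon^2h^2}\,\mathbf h. \tag{2.15}$$

If desired, we can eliminate v from this expression by using (2.3), which yields
the relations

$$\boldsymbol\varepsilon\cdot\mathbf v = \frac{\mu}{h^2}\,\mathbf h\cdot(\hat{\mathbf r}\times\boldsymbol\varepsilon) = -\mathbf v\cdot\hat{\mathbf r} = -\dot r. \tag{2.16}$$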


We can interpret (2.15) by regarding the entire osculating orbit as a rigid
body slowly “spinning” in space with rotational velocity ω and symmetry axis
along h. The physical significance of the two terms on the right side of (2.15)
can be identified at once. The first term $h^{-2}\mathbf h\times(\mathbf r\times\mathbf f) = h^{-2}\mathbf h\cdot\mathbf f\,\mathbf r$ describes
an instantaneous rotation about the radius vector r. This can only tilt the
orbital plane and the symmetry axis of the orbit, just as the axis of an axially
symmetric spinning rigid body is tilted in precession and nutation. The last
term describes an instantaneous rotation in the orbital plane about the
symmetry axis along h, a motion called pericenter (perihelion or perigee) or
apse precession by astronomers. Apse precession is most simply characterized
mathematically as a change in direction of the eccentricity vector ε in the
orbital plane. Note that the apse precession coefficient in (2.15) shows the
contribution of angular momentum change in the term with
$(\mathbf r\times\mathbf f)\cdot\mathbf h = \dot{\mathbf h}\cdot\mathbf h = h\dot h$.

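We get an equation of motion for the attitude of the osculating orbit by
inserting the explicit expression (2.15) for ω into the spinor equation
$\dot R = \tfrac12R\,i\boldsymbol\omega$. Also, we need independent equations for the orbital size and shape
parameters. We can obtain such equations easily from (2.11a, b) by using
$\mathbf h\cdot\dot{\mathbf h} = h\dot h$ and $\boldsymbol\varepsilon\cdot\dot{\boldsymbol\varepsilon} = \varepsilon\dot\varepsilon$. Thus, we obtain

$$h\dot h = (\mathbf r\times\mathbf f)\cdot\mathbf h = \mathbf f\cdot(\mathbf h\times\mathbf r) \tag{2.17}$$

and

$$\mu\varepsilon\dot\varepsilon = (\mathbf r\times\mathbf f)\cdot(\boldsymbol\varepsilon\times\mathbf v) + \mathbf f\cdot(\mathbf h\times\boldsymbol\varepsilon) = \mathbf f\cdot[(\boldsymbol\varepsilon\times\mathbf v)\times\mathbf r + \mathbf h\times\boldsymbol\varepsilon]. \tag{2.18}$$

One of these equations can be replaced by the energy equation

$$\dot E = \mathbf v\cdot\mathbf f, \tag{2.19}$$

which is most easily derived from (2.1) in a manner we have noted before.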
To complete this formulation of perturbation theory we need to add an
equation for the effect of perturbations on the time of flight along the
osculating orbit. However, we shall skip that, because it will not be needed
for the particular problems which we shall consider, and the method of
Section 8-4 is probably better for that purpose anyway.


Orbital Averages 


In many problems of satellite motion the variation of orbital elements is slow 
compared to the orbital period. In such problems we can simplify our 
perturbation equations considerably by averaging over an orbital period while 
holding the orbital elements fixed. This time-smoothing procedure eliminates 
oscillations in the orbital elements over a single orbital period. It eliminates r
and v from the perturbation equations, reducing them to equations for the
orbital elements R, h, ε and E alone. We shall refer to the resulting
time-smoothed equations as secular equations of motion, since the changes in the
orbital elements they describe are called secular variations by astronomers.
Secular equations are most appropriate for investigating long-term
perturbation effects.
The orbital average $\bar f$ of a physical quantity f = f(t) is defined by

$$\bar f = \frac1T\int_0^Tf(t)\,dt, \tag{2.20}$$

where T is the period of orbital motion, and the orbital elements are held
constant in the integration. To compute the average $\bar f$, therefore, f must be
expressed as an explicit function of the orbital elements. After averaging, we
release the time dependence of the orbital elements so $\bar f$ becomes a function
of time.

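Our secular equations of motion for the orbital elements can be written

$$\dot R = \tfrac12R\,i\bar{\boldsymbol\omega}, \tag{2.21a}$$

where

$$\bar{\boldsymbol\omega} = \frac{1}{h^2}\,\overline{\mathbf h\cdot\mathbf f\,\mathbf r} - \frac{\mathbf h}{\mu\varepsilon^2h^2}\left[\overline{\boldsymbol\varepsilon\cdot\mathbf v\,(\mathbf r\times\mathbf f)\cdot\mathbf h} + h^2\boldsymbol\varepsilon\cdot\bar{\mathbf f}\right]. \tag{2.21b}$$

Also,

$$h\dot h = \mathbf h\cdot\overline{(\mathbf r\times\mathbf f)}, \tag{2.22}$$
$$\mu\varepsilon\dot\varepsilon = \boldsymbol\varepsilon\cdot\overline{[\mathbf v\times(\mathbf r\times\mathbf f)]} + (\mathbf h\times\boldsymbol\varepsilon)\cdot\bar{\mathbf f}, \tag{2.23}$$
$$\dot E = \overline{\mathbf v\cdot\mathbf f}. \tag{2.24}$$

When a definite perturbing force function f is given, the indicated time
averages can be performed, and these become definite differential equations
to be solved for the time dependence of the orbital elements.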
To facilitate applications of the theory, we collect relations needed to
evaluate orbital averages efficiently and establish specific results which will be
useful in calculations later on. Orbital averages are often easier to compute
when the independent variable is an angle instead of time, because the
parametric representation of the orbit is simpler. So we need explicit
parametric representations for the dependent variables of interest and
relations among the alternative parameters. The relations we need were derived
in the first part of Chapter 4, primarily in Section 4-4, so we can just write
them down here.

Fig. 2.2. Parameters for an elliptical orbit.

The angle variables of interest, θ and φ, relate the position vector to the
orbital elements as shown in Figure 2.2. Astronomers call θ the true anomaly
and φ the eccentric anomaly. The dimensionless time variable

$$M = \frac{2\pi}{T}(t - \tau) \tag{2.25a}$$

is called the mean anomaly, and τ is the initial time of pericenter passage. The
orbital frequency parameter

$$n = \frac{2\pi}{T} = \left(\frac{\mu}{a^3}\right)^{1/2} \tag{2.25b}$$


is called the mean motion by astronomers. For parametric representations, 
the following relations are useful: 


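$$\mathbf r = \mathbf a(\cos\phi - \varepsilon) + \mathbf b\sin\phi, \tag{2.26}$$
$$\hat{\boldsymbol\varepsilon}\mathbf r = a(\cos\phi - \varepsilon) + \mathbf i\,b\sin\phi = re^{\mathbf i\theta}, \tag{2.27}$$

where $\mathbf i = i\hat{\mathbf h}$, $\hat{\mathbf a} = \hat{\boldsymbol\varepsilon}$, $\hat{\mathbf b} = \hat{\mathbf a}\mathbf i = \hat{\mathbf h}\times\hat{\mathbf a}$.

$$r = \frac{a(1 - \varepsilon^2)}{1 + \varepsilon\cos\theta} = a(1 - \varepsilon\cos\phi), \tag{2.28}$$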
Pee) = (2.29) 
aaa oe +) aie 
U | eee + a (2.30) 


The true anomaly is related to the time parameter by angular momentum
conservation;

$$\dot\theta = \frac{h}{r^2} = \frac{2\pi ab}{Tr^2}. \tag{2.31}$$

By differentiating Kepler's Equation (44.4) and using (2.28) we find

$$\dot\phi = \frac{2\pi a}{Tr}. \tag{2.32}$$

These last two equations can be reexpressed as the differential relations

$$\frac{dt}{T} = \frac{r\,d\phi}{2\pi a} = \frac{r^2\,d\theta}{2\pi ab} = \frac{dM}{2\pi}. \tag{2.33}$$

This enables us to express the time average in any of the equivalent forms

$$\bar f = \frac1T\int_0^Tf\,dt = \frac{1}{2\pi a}\int_0^{2\pi}f\,r\,d\phi = \frac{1}{2\pi ab}\int_0^{2\pi}f\,r^2\,d\theta = \frac{1}{2\pi}\int_0^{2\pi}f\,dM, \tag{2.34}$$

where r can be expressed in terms of θ or φ by (2.28) as needed.
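To illustrate averaging calculations, we work out a few examples.

$$\bar r = \frac{1}{2\pi a}\int_0^{2\pi}r^2\,d\phi = \frac{a}{2\pi}\int_0^{2\pi}(1 - \varepsilon\cos\phi)^2\,d\phi$$
$$= \frac{a}{2\pi}\left[\int_0^{2\pi}d\phi - 2\varepsilon\int_0^{2\pi}\cos\phi\,d\phi + \varepsilon^2\int_0^{2\pi}\cos^2\phi\,d\phi\right]$$
$$= \frac{a}{2\pi}\{2\pi - 0 + \varepsilon^2\pi\}.$$

Hence,

$$\bar r = a\left(1 + \tfrac12\varepsilon^2\right). \tag{2.35}$$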
In general, 
as a (1-—ecos ¢)"'' do 
prt aa n 
=a = = cos @(1 - € cos@)”" d¢. (2.36) 
T Jo 


When the numerator is expanded here, odd powers in cos φ can be discarded,
since

$$\int_0^{2\pi}\cos^k\phi\,d\phi = 0$$

if k is an odd integer. From considering (2.28), it is evident that φ is a more
convenient parameter than θ in these cases, because the cosine appears in the
numerator instead of the denominator. The opposite is true in the next
example.


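$$\overline{\left(\frac{1}{r^4}\right)} = \frac{1}{2\pi ab}\int_0^{2\pi}\frac{r^2\,d\theta}{r^4} = \frac{1}{2\pi ab}\int_0^{2\pi}\frac{d\theta}{r^2}.$$

Hence

$$\overline{\left(\frac{1}{r^4}\right)} = \frac{a}{2\pi b^5}\int_0^{2\pi}(1 + \varepsilon\cos\theta)^2\,d\theta = \frac{a}{b^5}\left(1 + \tfrac12\varepsilon^2\right) = \frac{\bar r}{b^5}. \tag{2.37}$$

Notice the reciprocity between θ and φ in the integrals for (2.35) and (2.37).
Considering (2.28), it is evident that this reciprocity is a general relation, and
it is not difficult to prove that

$$\overline{\left(\frac{1}{r^n}\right)} = \frac{\overline{r^{\,n-3}}}{b^{2n-3}} \quad\text{for } n \ge 3. \tag{2.38}$$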


Results of computations by the above method are given in Table 2.1. This 
straightforward method is adequate for computing any desired orbital aver- 
age, but we shall develop an elegant alternative approach. 
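The straightforward method can also be carried out numerically, which gives a quick check on entries such as (2.35). The sketch below evaluates the φ-form of the average in (2.34) with a uniform sum over the eccentric anomaly. It assumes the parametrization r = a(1 - ε cos φ) of (2.28) with the in-plane components of (2.26). The function name and the sample values of a and ε are illustrative only.

```python
import math

def orbital_average(func, a, ecc, n=2000):
    """Orbital (time) average of func(r, theta), using dt/T = r dphi/(2 pi a)."""
    b = a*math.sqrt(1.0 - ecc**2)          # semi-minor axis
    total = 0.0
    for k in range(n):
        phi = 2.0*math.pi*k/n              # eccentric anomaly
        x = a*(math.cos(phi) - ecc)        # component along the major axis
        y = b*math.sin(phi)                # component along the minor axis
        r = a*(1.0 - ecc*math.cos(phi))    # radial distance
        theta = math.atan2(y, x)           # true anomaly
        total += func(r, theta)*r          # weight r comes from dt ~ r dphi
    return total/(n*a)

a, ecc = 1.5, 0.3
b = a*math.sqrt(1.0 - ecc**2)
r_mean = orbital_average(lambda r, th: r, a, ecc)
inv_r3 = orbital_average(lambda r, th: r**-3, a, ecc)
inv_r4 = orbital_average(lambda r, th: r**-4, a, ecc)
cos_mean = orbital_average(lambda r, th: math.cos(th), a, ecc)
print(r_mean, inv_r3, inv_r4, cos_mean)
```

For these sample values the sums can be compared with a(1 + ε²/2) for the mean distance, with 1/b³ for the mean inverse cube, and with -ε for the mean of cos θ.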


TABLE 2.1. Orbital Averages 
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5 

—— 8 Sees —————— 

r'r = ates r? = pnt 
a 


r = & 
A = e7b2m5 


For an arbitrary constant vector u = u,e, + ue, + ue, where e, = &, e, = e, X e,, e, = h. 


| wir ] 1 | Be 2b? 1 
= == 4) ee | S= ts 
rn ee ar? ar™! ae: 


- aaa for n<4 
ad 


or, forn = 5 


| 7a 7 oe {ujeAz + uzex(A,- €°B3)}. 


no not 


where © Ay =i. — 2 ees 


> 


a~ a 


For arbitrary constant vectors u and w, 


wruwrr I 2 
| ee = ep {u,w,e,A, — [u,wre, + (uw, + U.W,)e.] (A; — €B,)}. 
r" 3 r! 4 ri? § =a 
where A,= ——-3 —-+3-—-r"°,. 
a a a 
yo n-G 
B,= MiG mse @ 
a 


urr = 5 {u,e,(1 + 4e*) + u,e,(t - E)Y = 5 {u(l-e°) + S5ee-u} 


= 


< {u,w,e,(3 + 2e°) + [uswre, + (uw, + uw,)e,] (1 - €°)} 


ed 


wrurr=-— 4 


Orbital averages can be systematically computed by exploiting the relation

$$i\mathbf h\mathbf v = \mu(\boldsymbol\varepsilon + \hat{\mathbf r}). \tag{2.39}$$

Since v is a time derivative its orbital average vanishes; explicitly,

$$\bar{\mathbf v} = \frac1T\int_0^T\frac{d\mathbf r}{dt}\,dt = \frac1T[\mathbf r(T) - \mathbf r(0)] = 0.$$

Therefore, from (2.39) we immediately get

$$\overline{\hat{\mathbf r}} = -\boldsymbol\varepsilon. \tag{2.40}$$

Note that this implies the relations

$$\overline{\cos\theta} = -\varepsilon, \qquad \overline{\sin\theta} = 0.$$


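To transform (2.39) into a form for computing general orbital averages, we
multiply it by r and separate scalar and bivector parts. The scalar part can be
written

$$\mu(\boldsymbol\varepsilon\cdot\mathbf r + r) = i\mathbf h\,\mathbf v\wedge\mathbf r = h^2 = \mu\frac{b^2}{a},$$

or

$$\boldsymbol\varepsilon\cdot\mathbf r = \frac{b^2}{a} - r. \tag{2.41}$$

The bivector part gives us

$$\boldsymbol\varepsilon\wedge\mathbf r = i\mathbf h\,\frac{\mathbf r\cdot\mathbf v}{\mu} = i\mathbf h\,\frac{r\dot r}{\mu}. \tag{2.42}$$

Adding these two equations and solving for r, we obtain

$$\mathbf r = \boldsymbol\varepsilon^{-1}\left(\frac{b^2}{a} - r + i\mathbf h\,\frac{r\dot r}{\mu}\right) = \frac{\hat{\boldsymbol\varepsilon}}{\varepsilon}\left(\frac{b^2}{a} - r\right) + \frac{\mathbf h\times\hat{\boldsymbol\varepsilon}}{\mu\varepsilon}\,r\dot r. \tag{2.43}$$

Since $2r\dot r = dr^2/dt$, this gives us the time average

$$\bar{\mathbf r} = \frac{\hat{\boldsymbol\varepsilon}}{\varepsilon}\left(\frac{b^2}{a} - \bar r\right) = -\tfrac32a\boldsymbol\varepsilon, \tag{2.44}$$

with the help of (2.29) and (2.35). More generally, (2.43) gives us

$$\frac{\mathbf r}{r^n} = \frac{\hat{\boldsymbol\varepsilon}}{\varepsilon}\left(\frac{b^2}{a}\frac{1}{r^n} - \frac{1}{r^{n-1}}\right) + \frac{\mathbf h\times\hat{\boldsymbol\varepsilon}}{\mu\varepsilon}\,\frac{\dot r}{r^{n-1}}.$$

Since the last term here is again a total time derivative, it averages to zero,
and we get

$$\overline{\left(\frac{\mathbf r}{r^n}\right)} = \frac{\hat{\boldsymbol\varepsilon}}{\varepsilon}\left[\frac{b^2}{a}\,\overline{\left(\frac{1}{r^n}\right)} - \overline{\left(\frac{1}{r^{n-1}}\right)}\right]. \tag{2.45}$$

Using (2.38), for n ≥ 4 this can be put in the slightly simpler form given in
Table 2.1.

Now let u be an arbitrary constant vector. With a Kepler frame (2.9) as
basis, we can write u in the expanded form

$$\mathbf u = u_1\mathbf e_1 + u_2\mathbf e_2 + u_3\mathbf e_3,$$

where $u_k = \mathbf u\cdot\mathbf e_k$. Using (2.43) in the form

$$\mathbf r = \mathbf e_1\frac{1}{\varepsilon}\left(\frac{b^2}{a} - r\right) + \mathbf e_2\frac{h}{\mu\varepsilon}\,r\dot r, \tag{2.46}$$

we get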


we get 
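$$\mathbf u\cdot\mathbf r\,\mathbf r = \frac{1}{\varepsilon^2}\left\{u_1\mathbf e_1\left(\frac{b^2}{a} - r\right)^2 + u_2\mathbf e_2\left(\frac{h\,r\dot r}{\mu}\right)^2 + (u_1\mathbf e_2 + u_2\mathbf e_1)\left(\frac{b^2}{a} - r\right)\frac{h\,r\dot r}{\mu}\right\}. \tag{2.47}$$

The last term will yield a vanishing orbital average even when multiplied by
an arbitrary polynomial in r. Therefore, (2.47) gives us

$$\overline{\left(\frac{\mathbf u\cdot\mathbf r\,\mathbf r}{r^n}\right)} = \frac{1}{\varepsilon^2}\left\{u_1\mathbf e_1\,\overline{\left[\frac{1}{r^n}\left(\frac{b^2}{a} - r\right)^2\right]} + u_2\mathbf e_2\,\frac{h^2}{\mu^2}\,\overline{\left[\frac{(r\dot r)^2}{r^n}\right]}\right\}. \tag{2.48}$$

The average of $(r\dot r)^2$ can be computed from (2.30), which gives us

$$\frac{(r\dot r)^2}{\mu} = 2r - \frac{r^2}{a} - \frac{b^2}{a}. \tag{2.49}$$

Then, for n ≥ 5, (2.38) can be used to put (2.48) in the useful form given in
Table 2.1.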

We now have all the techniques needed to reduce the orbital average of any 
homogeneous function of r to averages of powers of r = |r|. Let us consider 
one more example, which will be needed for our calculations later on. 
Dotting (2.47) with an arbitrary constant vector w and multiplying by (2.46), 
we get 


1 2 : 
wrwrr = ie u,w,e, Sie =f 


b? Be? (ri)? d 
+ [u,w,e, + (u,w, + uwyel( = fF la 7 + ant | 


(2.50) 


Proceeding as before to compute the orbital average, we get the result in 
Table 2.1. 


Astronomical Coordinates 


Our specification of orbital elements and equations of motion is mathemati- 
cally complete. However, we need to relate our attitude element to a 
conventional set of orbital elements to facilitate comparison of our results 
with observations and our theory with conventional perturbation theories. 
To measure the attitude of a satellite orbit in space, a system of
astronomical coordinates must be set up. These are angular coordinates relating
points on the celestial sphere, a unit sphere with points representing directions in
physical space. Positions of the “fixed” stars are nearly constant on the
celestial sphere, so they are good points of reference.

The first step in setting up a coordinate system is selection of a
convenient orthonormal frame {σ_k} of fixed reference directions. A
pole vector σ₃ is chosen normal to a reference plane, which intersects
the celestial sphere in a reference equator. The vector σ₁ is an arbitrary
direction in the reference plane, usually chosen as the direction of an easily
identifiable star for observational convenience. Then, of course, σ₂ is
determined by σ₂ = σ₃ × σ₁.

The osculating orbit of a satellite projects to a great circle on the celestial 
sphere. The Kepler frame {e,} of the orbit is related to the reference frame 


{o,} by 
C= hak. (2551) 


The attitude element R can be parametrized by a set of Euler angles ψ, ι, Ω,
as indicated in Figure 2.3; from Section 5-3 we have 


e(/2yim ei Vee Qa) 


Let us refer to these three angles as Eulerian (orbital) elements. To facilitate 
discourse, it is convenient to introduce some additional nomenclature from 
astronomy. 

The angle ι is the inclination of the orbit (0 < ι < 180°). The orbital motion
is said to be direct if 0 < ι < 90° and retrograde if 90° < ι < 180°. The vector 


o, Xe, 


Za i 
se = 
es Plane 
1 - ve 


Projected Orbit 


Fig. 2.3. Coordinates for the orbit of a satellite projected on 
the celestial sphere. 


R= e (2a e (ian e (V/2Nanse = @ (2a 


= ——— = 6 e (2.53) 
lon Xe, 
is the direction of the ascending node where the orbit crosses the reference
plane. The angle Ω is the longitude of ascending node. The angle ψ is the
argument of pericenter.
For observational astronomy, the most practical coordinate system is the
equatorial system, a geocentric system in which the celestial north pole σ₃ is the
point where the Earth’s axis penetrates the celestial sphere. In this system the
path of the sun is a great circle on the celestial sphere called the ecliptic. The
ecliptic cuts the celestial equator at two points called the equinoxes. The vernal
equinox is the ascending node of the ecliptic, and this is taken as the reference
direction σ₁. There is more special nomenclature for the equatorial system
which need not concern us here.

Another widely used coordinate system is the ecliptic system, in which the
north pole is the direction of the angular momentum vector of the Earth’s orbit
about the sun and σ₁ is also taken to be the direction of the vernal equinox.
The trouble with the equatorial and ecliptic systems is that their reference
directions are not truly constant, as we shall see.

Conventional perturbation theories use Eulerian elements to specify the 
attitude of an orbit, and so develop equations of motion for these variables. 
In contrast, use of the spinor element R enables us to develop the theory and 
applications without prior commitment to a particular set of angle variables. 
This has important practical as well as conceptual advantages: (1) A single set 
of Eulerian elements cannot be used for orbits of arbitrary eccentricity and 
inclination, because their equations of motion become ill-defined for some 
values of the parameters. (2) The variables which provide the simplest 
solution of the attitude equation of motion depend on the perturbing force, 
and they are not necessarily Euler angles. (3) Variables suitable for handling 
one perturbation may not be optimal when many perturbations are consid- 
ered together. 

Although we eschew the use of Eulerian elements in our perturbation 
theory, when the functional form of the attitude spinor R has been deter- 
mined for a particular problem, we may wish to express the results in terms of 
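Euler angles for practical reasons. That can be done by solving (2.52) for the
Euler angles in terms of R. The derivatives of the Euler angles are related to the
rotational velocity of a Kepler frame by

$$\boldsymbol\omega = \frac{2}{i}R^\dagger\dot R = \boldsymbol\sigma_3\dot\Omega + \mathbf n\,\dot\iota + \mathbf e_3\dot\psi, \tag{2.54}$$

a result derived in Chapter 7. This can be solved to get the variations in the
Eulerian elements from ω. Thus, multiplication of (2.54) by n gives us

$$\dot\iota = \boldsymbol\omega\cdot\mathbf n \tag{2.55a}$$

and

$$\boldsymbol\omega\times\mathbf n = \boldsymbol\sigma_3\times\mathbf n\,\dot\Omega + \mathbf e_3\times\mathbf n\,\dot\psi,$$

whence,

$$\dot\Omega = \frac{\boldsymbol\omega\cdot(\mathbf n\times\mathbf e_3)}{\mathbf n\cdot(\mathbf e_3\times\boldsymbol\sigma_3)} = \frac{\boldsymbol\omega\cdot\boldsymbol\sigma_3 - \boldsymbol\omega\cdot\mathbf e_3\cos\iota}{\sin^2\iota}, \tag{2.55b}$$
$$\dot\psi = \frac{\boldsymbol\omega\cdot(\mathbf n\times\boldsymbol\sigma_3)}{\mathbf n\cdot(\boldsymbol\sigma_3\times\mathbf e_3)} = \frac{\boldsymbol\omega\cdot\mathbf e_3 - \boldsymbol\omega\cdot\boldsymbol\sigma_3\cos\iota}{\sin^2\iota}. \tag{2.55c}$$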
Referring to Figure 2.3 for interpretation, a time variation of the inclination ι
is called nutation of the orbit. A variation of Ω is called precession of the
nodes or longitudinal precession. A variation of ψ is called, as before,
precession of the apses or major axis. Either type of precession is said to be
direct if the angle increases or retrograde if it decreases. The orbital tilt
mentioned previously in the discussion of Equation (2.15) is a combination of
nutation and longitudinal precession. Indeed, the decomposition of tilt into
nutation and precession depends on the chosen reference direction.
Conventional equations of motion for the Euler elements can be derived
from (2.55a, b, c) after inserting the expression for ω in terms of the
perturbing force given by (2.15). But we have no need for those equations.
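Although the conventional equations are not needed here, the projections (2.55a, b, c) are simple to evaluate when a comparison with Eulerian elements is wanted. The sketch below assumes the decomposition ω = σ₃ dΩ/dt + n dι/dt + e₃ dψ/dt of (2.54) and takes the node direction n to be the unit vector along σ₃ × e₃, which is how (2.53) is read here. The function name and the test values are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x*y for x, y in zip(a, b))

def euler_rates(omega, sigma3, e3):
    """Return (inclination rate, node rate, apse rate) from the rotational velocity.

    Assumes unit vectors sigma3 and e3 that are not parallel (sin(iota) != 0).
    """
    cos_i = dot(sigma3, e3)
    sin2_i = 1.0 - cos_i**2
    s3xe3 = cross(sigma3, e3)
    n = tuple(c/math.sqrt(sin2_i) for c in s3xe3)       # ascending node direction
    incl_rate = dot(omega, n)                                         # (2.55a)
    node_rate = (dot(omega, sigma3) - dot(omega, e3)*cos_i)/sin2_i    # (2.55b)
    apse_rate = (dot(omega, e3) - dot(omega, sigma3)*cos_i)/sin2_i    # (2.55c)
    return incl_rate, node_rate, apse_rate

# Hypothetical test: build omega from known rates and recover them.
iota = 0.7
sigma3 = (0.0, 0.0, 1.0)
e3 = (0.0, -math.sin(iota), math.cos(iota))   # orbit normal tilted about the x axis
node = (1.0, 0.0, 0.0)                        # unit vector along sigma3 x e3
rates_in = (-0.1, 0.2, 0.05)                  # (inclination, node, apse) rates
omega = tuple(sigma3[k]*rates_in[1] + node[k]*rates_in[0] + e3[k]*rates_in[2]
              for k in range(3))
rates_out = euler_rates(omega, sigma3, e3)
print(rates_out)
```

The division by sin²ι makes explicit why Eulerian elements become ill-defined for orbits lying in the reference plane.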


8-2. Exercises 
Ce) Begin with the general three body equations (6-5.2). Let particle 1 


be the primary with particle 2 as its satellite and particle 3 a 
perturbing body. Derive the perturbed 2-body equation 


Le 


r 
pea 
: 


, 


where uw = G(m, + m,), r = x,—x,, r’ = x,—x,, and the exact 
perturbing force is given by 


r’-r ; 


Pak 


f = Gm, | 


lr—r'|* r 
Note that f is a definite function f(r, ¢) if r’ = r’(f) is a specified 
function of time. Show that f can be derived from the potential 


1 rr (it) 
g(r, t) = Gm,| — | 
Ir-r'(] v(t) 
(2.2) Evaluate A_n in Table 2.1 for various values of n. Is there a pattern?
(2.3) Show that for a satellite subject to a central perturbing force 


f = -Adg(r), 


Cate + ov) 
u 


is a constant of the motion. Therefore, the eccentricity oscillates 
between maximum and minimum values depending on @(r). 
8-3. Perturbations in the Solar System 


In this section we discuss the principal perturbations on satellites in the solar 
system and calculate their long-term effects to first order. The various pertur- 
bations can be classified into three main types: 

(A) Gravitational perturbations following from Newton's law have the 
largest effects. Two subtypes are particularly important: (1) Quadrupole 
perturbations due to an asymmetric mass distribution in a nearby body, (2) 
Secular third body perturbations due to a distant orbiting body. These 
perturbations have a wide variety of effects on the orbital and rotational 
motions of planets and satellites. 

(B) Non-Newtonian perturbations from Einstein's theory of Relativity have 
the most subtle effects, which can be identified only after the Newtonian 
effects have been accounted for with great precision. 

(C) Non-gravitational perturbations such as atmospheric drag, the Solar 
wind and magnetic forces are negligible for the planets and larger satellites, 
but their effects on artificial satellites and the smaller asteroids are quite 
significant. 

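We will employ the secular perturbation theory developed in Section 8-2,
so let us repeat the equations of motion for the orbital elements for easy
reference:

$$\dot R = \tfrac12R\,i\bar{\boldsymbol\omega}, \tag{3.1a}$$

where

$$\bar{\boldsymbol\omega} = \frac{1}{h^2}\,\overline{\mathbf h\cdot\mathbf f\,\mathbf r} - \frac{\mathbf h}{\mu\varepsilon^2h^2}\left[\overline{\boldsymbol\varepsilon\cdot\mathbf v\,(\mathbf r\times\mathbf f)\cdot\mathbf h} + h^2\boldsymbol\varepsilon\cdot\bar{\mathbf f}\right], \tag{3.1b}$$

and

$$\mathbf e_k = R^\dagger\boldsymbol\sigma_kR \tag{3.1c}$$

is a Kepler frame, with $\mathbf e_1 = \hat{\boldsymbol\varepsilon}$, $\mathbf e_2 = \mathbf e_3\times\mathbf e_1$, $\mathbf e_3 = \hat{\mathbf h}$. And

$$h\dot h = \mathbf h\cdot\overline{(\mathbf r\times\mathbf f)}, \tag{3.2}$$
$$\mu\varepsilon\dot\varepsilon = \boldsymbol\varepsilon\cdot\overline{[\mathbf v\times(\mathbf r\times\mathbf f)]} + (\mathbf h\times\boldsymbol\varepsilon)\cdot\bar{\mathbf f}, \tag{3.3}$$
$$\dot E = \overline{\mathbf v\cdot\mathbf f}. \tag{3.4}$$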
Also, we need the relations 
._ ple’? -1) i. age 
a a ae a 


where a is the semi-major axis and T is the period of the osculating elliptical 
orbit. The variable v can be eliminated in the orbital averages by using 
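$$\mathbf v = \mu\,i(\boldsymbol\varepsilon + \hat{\mathbf r})\mathbf h^{-1} = \frac{\mu}{h^2}(\mathbf h\times\boldsymbol\varepsilon + \mathbf h\times\hat{\mathbf r}). \tag{3.6}$$

In particular, in (3.1b) it is convenient to use

$$\boldsymbol\varepsilon\cdot\mathbf v = \frac{\mu}{h^2}(\boldsymbol\varepsilon\times\mathbf h)\cdot\hat{\mathbf r} = -\frac{\mu\varepsilon}{h}\,\mathbf e_2\cdot\hat{\mathbf r}. \tag{3.7}$$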


We will refer back to Section 8-2 for calculation of the averages in these
equations. But note that if f = −∇V is a static conservative force, then

$$\dot E = \overline{\mathbf v\cdot\mathbf f} = -\frac1T\int_0^T\frac{dV}{dt}\,dt = 0. \tag{3.8}$$


Therefore, such a force can change the shape but not the size of the orbit. 


Oblateness Perturbations 


The Earth’s oblateness has significant effects on the orbits of artificial satel- 
lites near the Earth, and an evaluation of the effect of the Sun’s oblateness on 
the orbit of Mercury is needed for testing Einstein’s theory of gravitation. To 
be more specific, the main effects of the Earth’s oblateness on a near satellite 
are more than a million times greater than the effect of the Moon. Oblateness 
produces one of the two main perturbations on near satellites; the other is 
caused by the Earth’s atmosphere. These two perturbations can be consid- 
ered separately, because their effects are different in kind. Anyway, in first 
order perturbation theory, the effects of different perturbations are simply 
additive. 

In Section 8-1, we found that oblateness of an axisymmetric planet pro- 
duces a quadrupole gravitational force (per unit mass) on a satellite with the 
explicit form 


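$$\mathbf f = -\frac{K}{r^4}\left\{\left[1 - 5(\mathbf u\cdot\hat{\mathbf r})^2\right]\hat{\mathbf r} + 2\,\hat{\mathbf r}\cdot\mathbf u\,\mathbf u\right\}, \tag{3.9}$$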


with 
k= = ere : (3.10) 


where r, is the equitorial radius of the planet, and u is a unit vector along the 
planet’s symmetry axis. 

We derived the force (3.9) from a potential, so (3.8) tells us immediately
that the Kepler energy E is a secular constant of the motion. To determine the
secular equations for the other orbital elements, we need the following orbital
averages which are easily calculated from Table 2.1 in Section 8-2.


PRE = -2K[ FE | x SSG (aen1) 
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tona((F]-o( SE} 2(F) =) 


ag? BG acs 2ae? 
= {9 — 58 aut + ud + 53 ui | 


kag? 7 


sar o (1 -7u?-7u3) (3.12) 
[SBE xB 2 2keuh| sore coe 
ae r 
2kwerh? 
= as (u2—u2), (3.13) 


where u, = u-e,. 

Insertion of (3.11) into (3.2) tells us at once that h, like E, is a secular
constant of the motion, so ε must be constant as well. Therefore the satellite
orbit does not change size and shape under an oblateness perturbation; it only
rotates rigidly in space with its focus at the center of the primary as a fixed
point. In other words, the secular effects of oblateness are entirely
determined by the secular rotational velocity $\bar{\boldsymbol\omega}$. Inserting the orbital averages
(3.11), (3.12) and (3.13) into (3.1b), we get


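$$\bar{\boldsymbol\omega} = \kappa\left\{\left[2 - \tfrac52(\mathbf u\times\mathbf e_3)^2\right]\mathbf e_3 - \mathbf e_3\cdot\mathbf u\,\mathbf u\right\}, \tag{3.14}$$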

where 
a 1/2 2 
oe S| = oo |2| (3.15) 
hb? 2 4 u TO! P \ a 

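The equation of motion for the attitude spinor can therefore be written

$$\dot R = \tfrac12R\,i\bar{\boldsymbol\omega} = \tfrac12i\boldsymbol\omega_1R + \tfrac12R\,i\boldsymbol\omega_2, \tag{3.16a}$$

where

$$\boldsymbol\omega_1 = \kappa\left[2 - \tfrac52(\mathbf u\times\mathbf e_3)^2\right]\boldsymbol\sigma_3 = \tfrac12\kappa\left[5(\mathbf u\cdot\mathbf e_3)^2 - 1\right]\boldsymbol\sigma_3 \tag{3.16b}$$

and

$$\boldsymbol\omega_2 = -\kappa\,\mathbf e_3\cdot\mathbf u\,\mathbf u. \tag{3.16c}$$

Since ω₁ and ω₂ are constant vectors, Equation (3.16a) integrates to

$$R = e^{(1/2)i\boldsymbol\omega_1t}\,R_0\,e^{(1/2)i\boldsymbol\omega_2t}, \tag{3.17}$$

where R₀ is the initial value of the attitude spinor R.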
Equation (3.17) specifies the attitude of the orbit at any time. We have
already met such an equation in our study of spinning bodies, and that
experience is helpful in interpreting it. Since ω₂ is constant, the orbital
angular momentum vector h = he₃ undergoes steady gyroscopic precession
about the symmetry axis u of the oblate primary, along which ω₂ lies. This is
the same thing as longitudinal precession of the ascending and descending nodes,
as shown in Figure 3.1 for the case of an artificial Earth satellite. Using
(2.55b), we find that in one revolution the nodes precess through an angle
a 


AOS TOS Tome 


=- a 
(Le): 


2 

ia cost. (3.18) 

The negative sign means that the precession is retrograde. The precession rate
decreases with increasing inclination and vanishes for a polar orbit.
Observations of this precession by artificial Earth satellites provide precise
values for J₂ and higher order harmonic coefficients J_n, when included in the
perturbation theory.

Fig. 3.1. Satellite orbit on the celestial sphere showing regression of the
ascending node along the equator due to the Earth’s oblateness.

The other vector $\boldsymbol{\omega}_1$ in (3.17) gives the precession rate of the apses in the orbital plane. In one revolution the major axis turns through an angle $\Delta\psi = T\omega_1$, as shown in Figure 3.2. According to (3.16b), the precession rate depends on the inclination of the orbit and vanishes at the critical angle

$$i_c = \cos^{-1}(1/\sqrt{5}) = 63^\circ.43. \tag{3.19}$$

The precession is retrograde for larger inclinations and direct for smaller inclinations.

Fig. 3.2. Precession of the major axis (or perigee) due to oblateness.

Note that the magnitude of the oblateness effects decreases with increasing range by a factor proportional to $1/a^2$, so they are quite negligible at the distance of the Moon.
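
The two precession rates in (3.16b) and (3.16c) can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the standard first-order oblateness rates, which amount to taking $\kappa = \tfrac{3}{2}nJ_2(R/p)^2$ with $p = a(1-\epsilon^2)$ and R the equatorial radius, and it assumes the value $J_2 = 1.08263\times10^{-3}$ for the Earth (neither is quoted in the text above). The function names are hypothetical. The orbital elements are those of Vanguard I listed in Exercise (3.4).

```python
import math

J2_EARTH = 1.08263e-3  # assumed value of Earth's J2; not given in the text


def j2_secular_rates(a_over_R, ecc, incl_deg, mean_motion):
    """Return (node_rate, apse_rate) in the same units as mean_motion.

    Assumes kappa = (3/2) n J2 (R/p)^2 with p = a (1 - ecc^2), then applies
    the inclination dependence of (3.16c) and (3.16b).
    """
    p_over_R = a_over_R * (1.0 - ecc**2)          # semi-latus rectum / radius
    kappa = 1.5 * mean_motion * J2_EARTH / p_over_R**2
    cos_i = math.cos(math.radians(incl_deg))
    node_rate = -kappa * cos_i                     # regression of the nodes
    apse_rate = 0.5 * kappa * (5.0 * cos_i**2 - 1.0)  # motion of the perigee
    return node_rate, apse_rate


def critical_inclination_deg():
    """Inclination at which the apsidal rate vanishes: 5 cos^2 i = 1."""
    return math.degrees(math.acos(1.0 / math.sqrt(5.0)))


if __name__ == "__main__":
    # Vanguard I elements from Exercise (3.4); mean motion in deg/day.
    node, apse = j2_secular_rates(1.3603, 0.1896, 34.26, 3867.3)
    print(f"node rate = {node:+.2f} deg/day")
    print(f"apse rate = {apse:+.2f} deg/day")
    print(f"critical inclination = {critical_inclination_deg():.2f} deg")
```

With these inputs the rates come out near the oblateness entries quoted for Vanguard I in the table of Exercise (3.4), which is a useful sanity check on the sign conventions.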


Secular Third Body Forces 


Consider two satellites orbiting a common primary. Suppose the primary is at 
rest at the origin and the orbit of the outer satellite is a given 2-body ellipse. 
The very long term effect of the outer satellite (the third body) on the inner 
satellite can be estimated by a method originally employed by Gauss. The
time interval must be long enough for the third body to make many orbital
revolutions. If the periods of the two satellites are incommensurable, there is
no correlation between their positions on their respective orbits, and the net
third body potential at any position r is given by the time average

$$\phi_p(\mathbf{r}) = -\frac{Gm_p}{T_p}\int_0^{T_p}\frac{dt}{|\mathbf{r} - \mathbf{r}_p(t)|}. \tag{3.20}$$

This is the same as the potential of an elliptical ring formed by smearing the
mass $m_p$ of the perturbing body over its orbit with a density proportional to its
transit time. We shall refer to it as a secular potential to remind us of the
special conditions for its applicability.

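We can estimate the time average in (3.20) by expanding the integrand in the Legendre series (1.18). Since $r_p = |\mathbf{r}_p| > r = |\mathbf{r}|$, we obtain

$$\phi_p = -Gm_p\left[\overline{\left(\frac{1}{r_p}\right)} + \mathbf{r}\cdot\overline{\left(\frac{\mathbf{r}_p}{r_p^3}\right)} + \frac{3}{2}\,\overline{\left(\frac{(\mathbf{r}\cdot\mathbf{r}_p)^2}{r_p^5}\right)} - \frac{r^2}{2}\,\overline{\left(\frac{1}{r_p^3}\right)} + \dots\right]$$

$$= -Gm_p\left[\frac{1}{a_p} + 0 + \frac{3[r^2 - (\mathbf{r}\cdot\mathbf{u}_p)^2]}{4b_p^3} - \frac{r^2}{2b_p^3} + \dots\right]$$

$$= -Gm_p\left[\frac{1}{a_p} + \frac{r^2 - 3(\mathbf{r}\cdot\mathbf{u}_p)^2}{4b_p^3} + \dots\right], \tag{3.21}$$

where the subscripts indicate parameters of the third body orbit, and $\mathbf{u}_p$ is the
unit normal of the orbital plane. To lowest order, the secular gravitational
field of the perturbing body is

$$\mathbf{g}(\mathbf{r}) = -\nabla\phi_p = \frac{Gm_p}{2b_p^3}\,[\mathbf{r} - 3(\mathbf{r}\cdot\mathbf{u}_p)\mathbf{u}_p]. \tag{3.22}$$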


It may be surprising to get an axially symmetric field. The field is independent
of the alignment of the major axis for two reasons: the origin is at a focus
rather than the center of the ring, and the mass density on the ring increases
with distance from the origin at a rate just sufficient to cancel the 1/r fall off in
potential. Note that the field strength for an elliptical ring is greater than that
for a circular ring with the same mass and major axis, since $b_p^2 = a_p^2(1 - \epsilon_p^2)$.
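
The axial symmetry of (3.22) can be tested numerically. The following sketch, offered as an illustration rather than as part of the derivation, time-averages the attraction of a body on a Kepler ellipse at a point close to the focus and compares it with the quadrupole field (3.22). The assumptions are: units with $Gm_p = 1$, the perturber's orbit in the xy-plane with periapsis along +x, and the Kepler relation $dt/T = r_p^2\,d\theta/(2\pi ab)$ to turn the time average into an average over true anomaly. The function names are hypothetical.

```python
import math


def secular_field_numeric(r_vec, a, ecc, gm=1.0, n=20000):
    """Time-averaged attraction at r_vec of a body on a Kepler ellipse.

    The ellipse lies in the xy-plane with its focus at the origin.
    """
    b = a * math.sqrt(1.0 - ecc**2)
    p = a * (1.0 - ecc**2)
    gx = gy = gz = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        rp = p / (1.0 + ecc * math.cos(theta))
        weight = rp * rp / (a * b * n)   # dt/T for this step in true anomaly
        dx = rp * math.cos(theta) - r_vec[0]
        dy = rp * math.sin(theta) - r_vec[1]
        dz = -r_vec[2]
        d3 = (dx * dx + dy * dy + dz * dz) ** 1.5
        gx += weight * gm * dx / d3
        gy += weight * gm * dy / d3
        gz += weight * gm * dz / d3
    return (gx, gy, gz)


def secular_field_quadrupole(r_vec, a, ecc, gm=1.0):
    """Lowest-order field (3.22) with the unit normal u along +z."""
    b3 = (a * math.sqrt(1.0 - ecc**2)) ** 3
    x, y, z = r_vec
    return (gm * x / (2.0 * b3), gm * y / (2.0 * b3), -gm * z / b3)


if __name__ == "__main__":
    point = (0.001, 0.002, 0.0015)   # well inside the ring, in units of a
    print(secular_field_numeric(point, 1.0, 0.2))
    print(secular_field_quadrupole(point, 1.0, 0.2))
```

For a field point this close to the origin the two results should agree closely, and the in-plane components should show no dependence on the direction of the major axis, as the text asserts.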

At points which are closer to the ring than the origin, the Legendre
expansion converges so slowly that a great many terms are needed to approximate
the secular potential accurately. A more efficient method is available
when the ring is circular. In that case the potential can be evaluated explicitly 
in terms of elliptic integrals, but a more elementary approach will suffice here. 
We suppose that the satellite orbits are coplanar, so we need to evaluate the 
secular field only in that plane. It will be convenient to calculate the field 
directly instead of indirectly by calculating the potential first. The secular field 
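at r is given by

$$\mathbf{g}(\mathbf{r}) = \frac{Gm_p}{2\pi}\int_0^{2\pi} d\theta\,\frac{\mathbf{r}_p - \mathbf{r}}{|\mathbf{r}_p - \mathbf{r}|^3}, \tag{3.23}$$

where the true anomaly $\theta$ has been substituted for the time variable. The
circular orbit can be parameterized by $a_p = |\mathbf{r}_p|$ and

$$\mathbf{r}_p = a_p\,\hat{\mathbf{r}}\,e^{i\theta}, \tag{3.24}$$

where i is the unit bivector for the orbital plane. On the other hand,

$$\mathbf{r}_p - \mathbf{r} = R\,\hat{\mathbf{r}}\,e^{i\psi} \tag{3.25}$$

is the simplest parameterization of the relative position variable (Figure 3.3),
enabling us to put (3.23) in the form

$$\mathbf{g}(\mathbf{r}) = \frac{Gm_p\,\hat{\mathbf{r}}}{2\pi}\int_0^{2\pi} d\theta\,\frac{e^{i\psi}}{R^2}. \tag{3.26}$$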


This suggests that $\psi$ would be the most appropriate integration variable. The variables are related by

$$a_p e^{i\theta} = r + R e^{i\psi}, \tag{3.27}$$

which we obtain from (3.24) and (3.25). To express R as a function of $\psi$, we eliminate $\theta$ from (3.27); thus,

$$a_p^2 = r^2 + R^2 + 2rR\cos\psi.$$

Fig. 3.3. Coplanar orbits with a common primary. The outer orbit is circular.

The positive root of this equation gives us

$$R = R(\psi) = [a_p^2 - r^2\sin^2\psi]^{1/2} - r\cos\psi. \tag{3.28}$$

It will be convenient to define

$$R_\pi \equiv R(\psi + \pi) = [a_p^2 - r^2\sin^2\psi]^{1/2} + r\cos\psi.$$

Note that

$$R R_\pi = a_p^2 - r^2,$$
$$R_\pi - R = 2r\cos\psi,$$
$$R_\pi + R = 2(a_p^2 - r^2\sin^2\psi)^{1/2}. \tag{3.29}$$

The differential for the change in variables from $\theta$ to $\psi$ is obtained by
differentiating (3.27). After some algebra, we get

$$d\theta = \frac{R\,d\psi}{[a_p^2 - r^2\sin^2\psi]^{1/2}} = \frac{2R\,d\psi}{R + R_\pi}. \tag{3.30}$$


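Now we are prepared to derive an explicit formulation of the integral in
(3.26); thus,

$$\int_0^{2\pi} d\theta\,\frac{e^{i\psi}}{R^2} = 2\int_0^{2\pi}\frac{d\psi\,e^{i\psi}}{(R + R_\pi)R} = 2\int_0^{\pi}\frac{d\psi\,e^{i\psi}}{R + R_\pi}\left(\frac{1}{R} - \frac{1}{R_\pi}\right)$$

$$= \frac{2r}{a_p^2 - r^2}\int_0^{\pi}\frac{d\psi\,\cos^2\psi}{[a_p^2 - r^2\sin^2\psi]^{1/2}}.$$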

Therefore, our expression for the secular field can be written

$$\mathbf{g}(\mathbf{r}) = \frac{Gm_p\,\mathbf{r}}{\pi a_p(a_p^2 - r^2)}\int_0^{\pi}\frac{d\psi\,\cos^2\psi}{[1 - (r/a_p)^2\sin^2\psi]^{1/2}}. \tag{3.31}$$


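The integral can be evaluated in terms of elliptic integrals, but we will be
satisfied with a binomial expansion of the denominator to get

$$K(r) \equiv \int_0^{\pi}\frac{d\psi\,\cos^2\psi}{[1 - (r/a_p)^2\sin^2\psi]^{1/2}}$$

$$= \int_0^{\pi} d\psi\,\cos^2\psi\left[1 + \frac{1}{2}\left(\frac{r}{a_p}\right)^2\sin^2\psi + \frac{3}{8}\left(\frac{r}{a_p}\right)^4\sin^4\psi + \dots\right]$$

$$= \frac{\pi}{2}\left[1 + \frac{1}{8}\left(\frac{r}{a_p}\right)^2 + \frac{3}{64}\left(\frac{r}{a_p}\right)^4 + \dots\right]. \tag{3.32}$$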


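This should be compared with the expansion

$$\frac{a_p^2}{a_p^2 - r^2} = 1 + \left(\frac{r}{a_p}\right)^2 + \left(\frac{r}{a_p}\right)^4 + \dots$$

That shows that the second order term in (3.32) is essential for second order
accuracy better than ten percent. Inserting these expansions into (3.31), we
get

$$\mathbf{g}(\mathbf{r}) = \frac{Gm_p\,\mathbf{r}}{2a_p^3}\left\{1 + \frac{9}{8}\left(\frac{r}{a_p}\right)^2 + \frac{75}{64}\left(\frac{r}{a_p}\right)^4 + \dots\right\}. \tag{3.33}$$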


To first order this agrees with the result (3.22) from the Legendre expansion,
except for a factor $(1 - \epsilon_p^2)^{-3/2} \approx 1 + \tfrac{3}{2}\epsilon_p^2$ which we could include in (3.33) to
account for a small eccentricity in the orbit.
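
As a cross-check on the change of variables leading to (3.31) and on the series (3.33), the in-plane field of a circular ring can be evaluated three ways. This is a sketch under assumptions, not the text's own procedure: fields are expressed in units of $Gm_p/a_p^2$, the variable `x` stands for $r/a_p$, the direct average integrates the radial component of (3.23) over $\theta$, and the integral K is approximated with a simple midpoint rule. All function names are hypothetical.

```python
import math


def ring_field_direct(x, n=4000):
    """Radial field from a direct average of (3.23) over the ring angle."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        d2 = 1.0 + x * x - 2.0 * x * math.cos(theta)
        total += (math.cos(theta) - x) / d2**1.5
    return total / n


def ring_field_integral(x, n=4000):
    """Radial field from (3.31), with K evaluated by the midpoint rule."""
    k_sum = 0.0
    for k in range(n):
        psi = math.pi * (k + 0.5) / n
        k_sum += math.cos(psi) ** 2 / math.sqrt(1.0 - (x * math.sin(psi)) ** 2)
    k_val = k_sum * math.pi / n
    return x * k_val / (math.pi * (1.0 - x * x))


def ring_field_series(x):
    """Radial field from the truncated expansion (3.33)."""
    return 0.5 * x * (1.0 + (9.0 / 8.0) * x**2 + (75.0 / 64.0) * x**4)


if __name__ == "__main__":
    for x in (0.1, 0.3, 0.5):
        print(x, ring_field_direct(x), ring_field_integral(x), ring_field_series(x))
```

The first two columns should coincide to numerical precision, while the truncated series drifts away from them as $r/a_p$ grows, which indicates how far the expansion can be trusted.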


Solar and Lunar Perturbations of an Earth Satellite 


To investigate the influence of the Sun on the orbit of an Earth satellite, we 
adopt a geocentric reference system. In this system the Sun orbits the Earth, 
and thus generates a secular gravitational field which, according to (3.22), is 
given in the vicinity of the Earth by

$$\mathbf{g}_s = \frac{GM_s}{2b_s^3}\,[\mathbf{r} - 3\,\mathbf{r}\cdot\mathbf{u}\,\mathbf{u}], \tag{3.34}$$

where $M_s$ is the mass of the Sun, u is the pole vector of the ecliptic, and $b_s$ is
the semi-minor axis of the Sun’s orbit about the Earth, so $b_s = a_s(1 - \epsilon_s^2)^{1/2}$. This is
the average field of the Sun over a period of one year, so it is appropriate for
calculating secular perturbations over one or many years.

The secular rotational velocity of the orbit $\boldsymbol{\omega}$ is calculated from (3.1b), with
the help of Table 2.1 to compute the averages. Writing

$$\boldsymbol{\omega} = \omega_3\mathbf{e}_3 + \boldsymbol{\omega}_2, \tag{3.35}$$
we find 


1 —— eS h — 
wo, = Fe oot X £s)e; — ye €,'s. 


GM, ———_—— he —— 
= 3 <6.) ee 

whe {—3e,f u-rr:(u X e,) oe 3r-uu]} 
= S {+ a’e(1 — €?)[(u X e,)? — 2(u-e,)’] 


—a(1 — €°)[- + ae(1 — 3(u-e,)"}}. 


[1 a0 (u x e,)’ — S(u- Se lk 


Using $h^2 = GMa(1 - \epsilon^2)$ and $b^2 = a^2(1 - \epsilon^2)$, it will be convenient to write
this in the form

$$\omega_3 = \kappa\,[1 + (\mathbf{u}\times\mathbf{e}_3)^2 - 5(\mathbf{u}\cdot\mathbf{e}_1)^2], \tag{3.36}$$


where x is defined by 
SS iS n: 
ee 4\ ai J\GM Gee a 


where $T_s$ is the orbital period of the Sun and T is the orbital period of the
satellite. Also, we calculate


3 GM, —— 
O,= ire, X x ee ee Nee ee 3 
5 e, X (r X g,) ey gr 5 ae 


: : [u(1 — €°) + Su-selure, 


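which we can write as

$$\boldsymbol{\omega}_2 = \omega_u\mathbf{u} + \omega_1\mathbf{e}_1, \tag{3.38a}$$

where

$$\omega_u = -\kappa\,\mathbf{u}\cdot\mathbf{e}_3 \tag{3.38b}$$

and

$$\omega_1 = -\frac{5\epsilon^2\kappa}{1 - \epsilon^2}\,\mathbf{u}\cdot\mathbf{e}_3\,\mathbf{u}\cdot\mathbf{e}_1. \tag{3.38c}$$

Inserting (3.38a) into (3.35) we have

$$\boldsymbol{\omega} = \omega_3\mathbf{e}_3 + \omega_1\mathbf{e}_1 + \omega_u\mathbf{u}. \tag{3.39}$$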


Note the special form of this vector; u is constant and $\mathbf{e}_k = R^\dagger\boldsymbol{\sigma}_k R$. This
observation and our experience with Euler angles enable us to formally
integrate the attitude equation

$$\dot{R} = \tfrac{1}{2}Ri\boldsymbol{\omega} \tag{3.40}$$
with the result 

eee ee ee (3.41) 
where

$$\phi - \phi_0 = \int_0^t \omega_u\,dt,$$

$$\theta - \theta_0 = \int_0^t \omega_1\,dt,$$

$$\psi - \psi_0 = \int_0^t \omega_3\,dt. \tag{3.42}$$


We can interpret (3.41) as follows: The third factor on the right side describes
precession of the orbital plane about the pole vector of the ecliptic u with an
angular velocity $\omega_u = \dot{\phi}$; so $\phi$ is the angle of precession. The second factor
describes the inclination of the orbital plane and nutation about an average
inclination if its rate is periodic, as it happens to be in (3.38c). The first factor
describes precession of the apses in the orbital plane at the rate $\omega_3 = \dot{\psi}$.

The precession of the nodes which astronomers observe is actually a
combination of the second and third factors, as can be seen by comparing
(3.39) with (2.54). The parametrization used here is better than the conventional
one, because it simplifies integrations.

Let us see what these results tell us about the motion of Earth’s most
important satellite, the Moon. The expression (3.38b) for $\omega_u$ tells us that the
precession is retrograde and constant if the eccentricity in (3.37) is constant.
Using the data $\epsilon_s = 0.0167$ and $\epsilon = 0.0549$ for the orbital eccentricities of Sun
and Moon respectively, we find

$$\left[\frac{1 - \epsilon_s^2}{1 - \epsilon^2}\right]^{3/2} \approx 1 + \tfrac{3}{2}(\epsilon^2 - \epsilon_s^2) = 1.0041. \tag{3.43}$$

Then, using (3.37) and the empirical value $\theta_0 = 5.14^\circ$ for the average inclination,
we estimate that the period of precession of the Moon’s orbit is

$$\frac{2\pi}{|\omega_u|} = \frac{2\pi}{\kappa\cos\theta_0} = \frac{4}{3}\,\frac{T_s^2}{T}\,\frac{1.0041}{0.996} = 17.98\ \text{yr}. \tag{3.44}$$


This is reasonably close to the observed value of 18.61 yr. We can estimate
the accuracy of our approximation by recalling that, in the Legendre expansion
used to calculate the secular perturbing force, the neglected third order
terms are smaller than the second order terms by the ratio of Earth-Moon
distance to Earth-Sun distance, that is, by

$$\frac{a}{a_s} = \frac{3.844\times10^5\ \text{km}}{1.495\times10^8\ \text{km}} = 0.00257. \tag{3.45}$$


This is of the same order of magnitude as the eccentricity correction (3.43), so 
third order terms would be needed for an accurate calculation. 
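
The arithmetic behind the estimates (3.43) to (3.45) is easy to reproduce. The sketch below is hedged accordingly: it takes the period of nodal precession as $(4/3)(T_s^2/T)$ times the eccentricity factor (3.43) divided by $\cos\theta_0$, which is one reading of the chain of equalities in (3.44), and it assumes a lunar orbital period of 27.32 days (a value not quoted in the text). The function name is hypothetical.

```python
import math


def moon_node_precession(T_sun_days=365.25, T_moon_days=27.32,
                         ecc_sun=0.0167, ecc_moon=0.0549, incl_deg=5.14):
    """Return (eccentricity factor, precession period in years).

    T_moon_days is an assumed value for the Moon's orbital period.
    """
    factor = ((1.0 - ecc_sun**2) / (1.0 - ecc_moon**2)) ** 1.5
    period_days = ((4.0 / 3.0) * T_sun_days**2 / T_moon_days
                   * factor / math.cos(math.radians(incl_deg)))
    return factor, period_days / T_sun_days


def distance_ratio(a_moon_km=3.844e5, a_sun_km=1.495e8):
    """Ratio of Earth-Moon to Earth-Sun distance, as in (3.45)."""
    return a_moon_km / a_sun_km


if __name__ == "__main__":
    factor, period = moon_node_precession()
    print(f"eccentricity factor = {factor:.4f}")
    print(f"precession period   = {period:.2f} yr")
    print(f"distance ratio      = {distance_ratio():.5f}")
```

The period obtained this way depends slightly on the value adopted for the Moon's orbital period, so the last digit should not be over-interpreted.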

The inclination of the Moon’s orbit with respect to the ecliptic is nearly
constant with a value of 5°.14. So, in the equatorial system of an observer on
Earth, the steady precession of the Moon’s orbit will appear as an oscillation
in the inclination of the Moon’s orbit over a range of 10°.28 with a period of
18.61 yr.

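To estimate $\omega_3$, we use (3.41) to write (3.36) in the form

$$\omega_3 = \kappa\,[1 + \sin^2\theta - 5\sin^2\theta\,\sin^2\psi]. \tag{3.46}$$

Estimating $\theta$ by its average value $\theta_0 = 5.14^\circ$ and using the average value
$\overline{\sin^2\psi} = \tfrac{1}{2}$, this gives us

$$\omega_3 = \kappa\,[1 - \tfrac{3}{2}\sin^2\theta_0] = \kappa(0.9880). \tag{3.47}$$

So, for the period of precession of the Moon’s perigee, (3.44) yields the
estimate

$$\frac{2\pi}{\omega_3} = 18.08\ \text{yr}. \tag{3.48}$$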
This is about twice the observed value of 8.85 yr. In view of our estimate 
(3.45), this discrepancy appears to be too large to be accounted for by a third 
order correction. It suggests, rather, that a factor of 2 has been overlooked in 
the calculation. The matter deserves to be examined more carefully. 
To estimate the amplitude and period of the secular nutation of the Moon’s
orbit, we use (3.41) to write (3.38c) in the form

$$\dot{\theta} = -\frac{5\epsilon^2\kappa}{1 - \epsilon^2}\,\frac{\sin 2\theta}{2}\,\sin\psi. \tag{3.49}$$

Because of the small coefficient on the right, it is permissible to replace $\theta$ by
its average value and assume that $\dot{\psi} = \omega_3$ is constant, so (3.49) is easily
integrated to get

$$\theta - \theta_0 = \frac{5\epsilon^2\sin 2\theta_0}{2(1 - \epsilon^2)}\,\frac{\kappa}{\omega_3}\,\cos\psi. \tag{3.50}$$

This tells us that the nutation has period $2\pi/\omega_3$, and if we use the empirical value
$\omega_3 \approx 2\kappa$, for the amplitude of the nutation we obtain the estimate

$$\tfrac{5}{4}\,\epsilon^2\sin 2\theta_0 = \tfrac{5}{4}(0.0030)(0.1785) = 0.0007\ \text{rad}, \tag{3.51}$$

or 2.4 min of arc. Of course, this effect is much too small to be observed in the
approximation we are working with here. But it gives us a preview of an effect
to be expected in the next approximation.

It should be remembered that we have been considering only long-term
perturbations of the Moon’s motion. Perturbation by the Sun also produces
significant periodic effects of shorter term, including nutation with amplitude
of nine minutes, and a 40% oscillation in eccentricity.

For long-term solar effects on a near-Earth artificial satellite our results will 
be much more accurate than for the Moon and especially significant since they 
apply for arbitrary inclination and eccentricity. However, the satellite will be 
perturbed by the Moon as well as the Sun. In fact, the lunar effect is more 
than twice as great as the solar effect, for an estimate of the ratio of their 
effects from (3.37) gives 


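$$\frac{\kappa_M}{\kappa_s} = \frac{M_M}{M_s}\left(\frac{a_s}{a_M}\right)^3\left(\frac{1 - \epsilon_s^2}{1 - \epsilon_M^2}\right)^{3/2} = 2.194. \tag{3.52}$$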


This is the same as our earlier estimate for the ratio of lunar/solar tidal forces 
on Earth. 

To calculate the lunar-solar effects on a satellite, we simply add separate 
versions of (3.39) for Sun and Moon; thus, 


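$$\boldsymbol{\omega} = (\omega_{3M} + \omega_{3s})\mathbf{e}_3 + (\omega_{1M} + \omega_{1s})\mathbf{e}_1 + (\omega_{uM}\mathbf{u}_M + \omega_{us}\mathbf{u}_s). \tag{3.53}$$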


Most notable here is the fact that the axis of precession specified by the last
term is determined by vector addition, and, of course, it will change slowly
with time since $\mathbf{u}_M$ precesses about $\mathbf{u}_s$ with an 18 yr period, as we learned in
our study of the Moon. It should be noted that a second order calculation of
lunar effects should not be expected to give better than 10% accuracy, for a
bound on the accuracy is given by the ratio

Earth radius 1 


——$<—$_______——_ = — = 90.012. 3.54 
Earth-Moon distance 1 ( ) 


R 


Luni-Solar Precession and Nutation 


Since the Earth is oblate, the nonuniform gravitational fields of the Sun and
the Moon exert torques on the Earth, driving changes in the direction of
Earth’s axis known as Luni-Solar precession and nutation.

To calculate the long-term effects of the Sun on the Earth’s rotation we use
the secular field (3.34). This field exerts a torque on the Earth

$$\boldsymbol{\Gamma}_s = \int dm\,\mathbf{r}\times\mathbf{g}_s = -\frac{3GM_s}{2b_s^3}\left[\int dm\,\mathbf{r}\,\mathbf{r}\cdot\mathbf{u}_s\right]\times\mathbf{u}_s.$$

Using the expression (1.20a) for the inertia tensor $\mathcal{I}$, this can be written

$$\boldsymbol{\Gamma}_s = \frac{3GM_s}{2b_s^3}\,(\mathcal{I}\mathbf{u}_s)\times\mathbf{u}_s. \tag{3.55}$$

Then, using the form (1.25) for the inertia tensor of an axisymmetric
body, we get

$$\boldsymbol{\Gamma}_s = -\frac{3GM_s}{2b_s^3}\,(I - I_3)\,\mathbf{u}_s\cdot\mathbf{e}\,\mathbf{e}\times\mathbf{u}_s, \tag{3.56}$$

where e represents the symmetry axis of the Earth. To this we must add a
completely analogous expression for the Moon to get the total Luni-Solar
torque $\boldsymbol{\Gamma}_s + \boldsymbol{\Gamma}_M$.

Now, by the method developed in Section 7-3, we can reduce the equation
of motion for the Earth’s angular velocity $\boldsymbol{\Omega}$ to the form


Kho, +T,,, 
or 

Q=exF GSB 
where F is defined by 


OH he GM, eM) 
=-—- — : ———— . + ! . 
| D \|( b; )we | ce a 


We know from our studies in Sections 7-3 and 7-4 that the solution of (3.57) 
can be quite complicated even when F is constant, but it is sensitive to initial 
conditions. Fortunately, any large nutation that the Earth may have had as a 
result of initial conditions has long since been damped out by the time 
dependence of F. 

To extract the main effects from (3.58) we assume that $\mathbf{u}_s$ is constant,
though it actually precesses slowly due to perturbations by the other planets.
We know from (3.44) that $\mathbf{u}_M$ precesses about $\mathbf{u}_s$ with a period of 18.6 yr.
Therefore, the direction of F oscillates and this drives a nutation of the
Earth’s axis with the same period. This nutation with a period of 18.6 yr is
called Luni-Solar nutation. The nutation is an oscillation about steady precession
superimposed on the Chandler wobble shown in Figure 3.5 of Section
7-3.


(3.58)

To study the steady precession, we separate it from the nutation by
replacing $\mathbf{u}_M$ in (3.58) by its mean value over its period, $\bar{\mathbf{u}}_M = \alpha\mathbf{u}_s$, where

$$\alpha = \mathbf{u}_M\cdot\mathbf{u}_s = \cos(5.14^\circ) = 0.996.$$

Introducing empirical values for the other parameters, we get


3 a). M b 31GM 
F a 1 3 ; EB) ies} aaa M 
rl | i Juce ; ae (He |( Ps | | bs 


q 


=: paatllaee (= J 
69.87 s; Siew ee 
By Equation (3.50b) of Section 7-3, the period of precession for a slow top is
given by
222-e 


Te = Ty 


(3.59) 


Here $\boldsymbol{\Omega}\cdot\mathbf{e} = 2\pi/\text{day}$. Hence, for the period of Luni-Solar precession we get
the estimate


T,, = (69.87)(365.25)yr = 25521 yr. (3.60) 


This is reasonably close to the observed value of 25400 yr. The Luni-Solar 
precession is directly observable as a precession of the equinoxes. Note that 
the minus sign in (3.58) means that the precession is retrograde. 


Satellite Attitude Stability 


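In Section 8-1 we found that an asymmetric body with inertia tensor $\mathcal{I}$ in the field
of a centrosymmetric body with mass M is subjected to a gravitational
torque

$$\boldsymbol{\Gamma} = \frac{3GM}{r^3}\,\hat{\mathbf{r}}\times\mathcal{I}\hat{\mathbf{r}}. \tag{3.61}$$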


This is appropriate for investigating short-term effects of gravitational 
torques, in contrast to (3.55), which applies only to long-term effects. 

Note that the torque (3.61) vanishes if one of the principal axes is aligned
with r. It is of interest to investigate the stability of such an alignment for
orbiting satellites. Such knowledge can be used, for example, to keep communication
satellites facing the Earth. Fortunately, the simplest alignment is also
the most significant, so we shall limit our attention to that case.

We suppose that one principal vector $\mathbf{e}_3$ is perpendicular to the orbital
plane, so it is not affected by the orbital motion. Then we can write

$$\mathcal{I}\hat{\mathbf{r}} = I_1\mathbf{e}_1\cos\theta - I_2\mathbf{e}_2\sin\theta,$$

where $\theta$ measures the deviation of $\mathbf{e}_1$ from $\hat{\mathbf{r}}$, so $\hat{\mathbf{r}}\cdot\mathbf{e}_1 = \cos\theta$ and $\hat{\mathbf{r}}\cdot\mathbf{e}_2 = -\sin\theta$.
It follows that

$$\hat{\mathbf{r}}\times\mathcal{I}\hat{\mathbf{r}} = \mathbf{e}_3(I_1 - I_2)\cos\theta\,\sin\theta.$$

Also, if $\boldsymbol{\Omega}$ is the angular velocity of the satellite, then $\mathcal{I}\boldsymbol{\Omega} = I_3\boldsymbol{\Omega} = I_3\dot{\theta}\,\mathbf{e}_3$.
Therefore, Euler’s equation for the satellite reduces to

$$I_3\ddot{\theta} = \frac{3GM}{2r^3}\,(I_1 - I_2)\sin 2\theta. \tag{3.62}$$

For a circular orbit, we can write $GM/r^3 = n^2$, where n is the constant orbital
angular velocity. Then in the small angle approximation $\sin 2\theta \approx 2\theta$, it is
evident that solutions to (3.62) are stable about $\theta = 0$ if $I_1 < I_2$ and unstable
if $I_1 > I_2$.

Assuming $I_1 < I_2$ and $n^2 = GM/r^3$ constant, we recognize (3.62) as the
equation for a pendulum, for which we found the general solution in terms of
elliptic functions in Section 7-4. For small displacements from equilibrium,
the satellite oscillates harmonically with natural period

$$T = \frac{2\pi}{n}\left[\frac{I_3}{3(I_2 - I_1)}\right]^{1/2}. \tag{3.63}$$
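
The small-oscillation period (3.63) can be compared with a direct numerical integration of (3.62). The sketch below is illustrative only: it assumes a circular orbit so that $GM/r^3 = n^2$, uses a fixed-step fourth-order Runge-Kutta scheme, and estimates the period as four times the time to the first zero crossing of $\theta$. The function names and the sample moments of inertia are hypothetical.

```python
import math


def libration_period_formula(n, I1, I2, I3):
    """Small-oscillation period from (3.63); requires I1 < I2."""
    return (2.0 * math.pi / n) * math.sqrt(I3 / (3.0 * (I2 - I1)))


def libration_period_numeric(n, I1, I2, I3, theta0=0.01, dt=1e-3):
    """Integrate (3.62) with RK4 and return four times the quarter period."""
    if I1 >= I2:
        raise ValueError("alignment is unstable unless I1 < I2")
    coeff = 1.5 * n * n * (I1 - I2) / I3      # theta'' = coeff * sin(2 theta)

    def acc(theta):
        return coeff * math.sin(2.0 * theta)

    theta, omega, t = theta0, 0.0, 0.0
    while True:
        k1t, k1w = omega, acc(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, acc(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, acc(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, acc(theta + dt * k3t)
        new_theta = theta + dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0
        new_omega = omega + dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
        if new_theta <= 0.0:
            # Linear interpolation for the zero crossing within this step.
            frac = theta / (theta - new_theta)
            return 4.0 * (t + frac * dt)
        theta, omega, t = new_theta, new_omega, t + dt


if __name__ == "__main__":
    args = (1.0, 1.0, 1.2, 1.5)   # n, I1, I2, I3 in arbitrary units
    print(libration_period_formula(*args))
    print(libration_period_numeric(*args))
```

For small initial angles the two numbers should agree closely; increasing `theta0` shows the lengthening of the period expected for a pendulum at finite amplitude.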


This result is pertinent to the Moon. Its mean rotational period is equal to its
orbital period, so it orbits with the same face (centered at one end of its
longest principal axis) directed towards the Earth. A small perturbation
would be sufficient to set the face in harmonic oscillation about this equilibrium
motion. In principle, one could measure the period of this oscillation
and obtain a value for $(I_2 - I_1)/I_3$ from (3.63). But the actual oscillations are so
small and the period is so long that such a measurement was not feasible until
quite recently. However, the same quantity can be determined from
observation of the forced oscillation which synchronizes the orbital and rotational
motions. This is due to the fact that the Moon’s orbit is actually
elliptical, so the factor $r^{-3}$ undergoes small oscillations over an orbital period
which force oscillations of the angle $\theta$ according to (3.62).

The equivalence of lunar spin and orbital periods is an example of spin-orbit
resonance which exists throughout the solar system. It is believed to
come about in the following way: The Moon loses an arbitrary initial spin
gradually by tidal effects until it falls into a stable resonant state, where the
rotational motion is sustained by parametric orbital forcing through the
coefficient $r^{-3}$ in the torque. Mercury is in a 3:2 resonant state. Other
spin-orbit resonances are found among the moons of the major planets, and
some moons are coupled in orbit-orbit resonances as well.


The Advance of Mercury’s Perihelion


We turn now to secular perturbations by the planets. As a specific example, 
we evaluate the secular perturbations of Mercury. We shall see that the main 
effect is a forward precession of Mercury’s perihelion. Historically, difficulty 
in accounting for this effect provided the first evidence for limitations of 
Newtonian theory, and the resolution of the difficulty was one of the first 
triumphs of Einstein’s General Theory of Relativity. More accurate data from 
the space program will undoubtedly make this a more stringent test of 
gravitational theories in the future. 

We aim to calculate the main effect on Mercury to an accuracy of a few
percent. From the table of planetary data in Appendix C, we conclude that,
to this accuracy, we can neglect the inclinations and eccentricities of the
perturbing planets; for our experience with Solar perturbations has taught us
that the relative effect of inclination is on the order of the sine of the angle,
and eccentricity corrections are on the order of eccentricity squared. Accordingly,
we can use (3.31) and (3.32) in a heliocentric reference system to
describe the secular force $\mathbf{f}_p$ of the pth planet on Mercury. The force is a
central force

$$\mathbf{f}_p = f_p(r)\,\hat{\mathbf{r}}, \tag{3.64a}$$

where $f_p(r)$ has the explicit form

$$f_p(r) = \frac{Gm_p}{2a_p}\,\frac{r}{a_p^2 - r^2}\left[1 + \frac{1}{8}\left(\frac{r}{a_p}\right)^2 + \frac{3}{64}\left(\frac{r}{a_p}\right)^4 + \dots\right]. \tag{3.64b}$$

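Our perturbation equations (3.1) to (3.4) tell us that the only secular effect
of a central force f is a precession of the apses at a rate

$$\omega_p = -\frac{h}{\mu\epsilon^2}\,\boldsymbol{\epsilon}\cdot\bar{\mathbf{f}}_p. \tag{3.65}$$

Computation of the orbital average $\bar{\mathbf{f}}_p$ can be simplified by exploiting the fact
that the radius r oscillates about Mercury’s semi-major axis a. To second
order, a Taylor expansion gives us

$$f_p(r) = f_p(a) + (r - a)f_p'(a) + \tfrac{1}{2}(r - a)^2 f_p''(a). \tag{3.66}$$

Inserting this into (3.64b) and computing the orbital averages

$$\overline{\hat{\mathbf{r}}} = -\boldsymbol{\epsilon}, \qquad \overline{(r - a)\hat{\mathbf{r}}} = -\tfrac{1}{2}a\boldsymbol{\epsilon}, \qquad \overline{(r - a)^2\hat{\mathbf{r}}} = -\tfrac{1}{2}a^2\epsilon^2\boldsymbol{\epsilon},$$

we obtain

$$\bar{\mathbf{f}}_p = -\boldsymbol{\epsilon}\,\{f_p(a) + \tfrac{1}{2}a f_p'(a) + \tfrac{1}{4}\epsilon^2a^2 f_p''(a)\}. \tag{3.67}$$

Although Mercury has a large eccentricity $\epsilon = 0.2056$, the coefficient
$\epsilon^2/4 = 0.0106$ shows that the last term can be neglected in our approximation.
Inserting (3.67) into (3.65) and using (3.64b) as well as the identity

$$\frac{h}{\mu} = \frac{2\pi}{T}\,\frac{a^2(1 - \epsilon^2)^{1/2}}{GM_s},$$

we get the secular precession rate in a form which is convenient for computation:

$$\omega_p = (1 - \epsilon^2)^{1/2}\,\frac{2\pi a^2}{T\,GM_s}\,[f_p(a) + \tfrac{1}{2}a f_p'(a)]. \tag{3.68}$$

Here T is Mercury’s orbital period and $M_s$ is the mass of the Sun. In units with
the Earth mass and Earth-Sun distance set equal to 1, the coefficient in (3.68)
has the value

$$(1 - \epsilon^2)^{1/2}\,\frac{\pi a^2}{T M_s} = \frac{(0.97864)\,\pi\,(0.3871)^2}{(87.969\ \text{day})(3.329\times10^5)}\left(365.25\times10^2\ \frac{\text{day}}{\text{cent}}\right)\left(2.0626\times10^5\ \frac{\text{sec}}{\text{rad}}\right) = 118.53\ \frac{\text{sec}}{\text{cent}}, \tag{3.69}$$

where the units of radians per day have been converted to seconds of arc per
century. To evaluate the remaining factor in (3.68), it is convenient to express
$2G^{-1}f_p(r)$ as the product $A_p(r)B_p(r)$ of a dominant term

$$A_p(r) = \frac{m_p\,r}{a_p(a_p^2 - r^2)} \tag{3.70a}$$

and a correction factor

$$B_p(r) = 1 + \frac{1}{8}\left(\frac{r}{a_p}\right)^2 + \frac{3}{64}\left(\frac{r}{a_p}\right)^4. \tag{3.70b}$$

Then, we compute the derivatives

$$aA_p'(a) = A_p(a)\,\frac{a_p^2 + a^2}{a_p^2 - a^2}, \tag{3.70c}$$

$$aB_p'(a) = \frac{1}{4}\left(\frac{a}{a_p}\right)^2 + \frac{3}{16}\left(\frac{a}{a_p}\right)^4. \tag{3.70d}$$

And we use these results in

$$C_p \equiv \frac{2}{G}\,[f_p(a) + \tfrac{1}{2}a f_p'(a)] = A_p(B_p + \tfrac{1}{2}aB_p') + \tfrac{1}{2}aA_p'B_p. \tag{3.71}$$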


This quantity is evaluated numerically for each of the perturbing planets in
Table 3.1, which also displays values for the various factors so their contributions
can be assessed. Values for the three most distant planets Uranus,
Neptune and Pluto have been omitted, since it is evident that, by comparison
with the small contribution of the larger and closer planet Saturn, their
contributions are negligible. Figures in the last column are the calculated
precession rates $\omega_p$ due to each of the five closest planets. For the total
secular precession rate of Mercury’s perihelion due to planetary perturbations,
we obtain

$$\omega = \sum_p \omega_p = 541''.3/\text{cent}. \tag{3.72}$$

This is within 2% of the accepted value 531''.5/cent, about as close as we
should expect.


TABLE 3.1. Data and Calculations
(masses in Earth masses, distances in units of the Earth-Sun distance, precession rates in seconds of arc per century)

Planet        m_p        a_p       a/a_p     A_p       aA_p'/2A_p   B_p      B_p + aB_p'/2   C_p       ω_p
1. Mercury    0.0554     0.38710   1.0
2. Venus      0.815      0.72333   0.53516   1.168     0.90135      1.0396   1.0831          2.3600    279.73
3. Earth      1.000      1.00000   0.38710   0.45533   0.67625      1.0198   1.0406          0.7878    93.38
4. Mars       0.1075     1.52369   0.25405   0.01258   0.56900      1.0060   1.0167          0.01999   2.37
5. Jupiter    317.83     5.20280   0.07440   0.87845   0.50539      1.0007   1.0013          1.3271    157.30
6. Saturn     95.147     9.53884   0.05481   0.04798   0.50147      1.0002   1.0003          0.0720    8.53
7. Uranus     14.54      19.1819
8. Neptune    17.23      30.0578
9. Pluto      0.17       39.44
Sun           3.329 × 10^5
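
The tabulated quantities follow mechanically from (3.69) to (3.71), so they are easy to recompute. The sketch below is a hedged illustration: it takes the masses and orbital radii from Table 3.1, evaluates $C_p$ from (3.70a) to (3.71), and multiplies by the coefficient (3.69). The function and variable names are hypothetical, and individual entries need not match the printed table digit for digit.

```python
import math

ARCSEC_PER_RAD = math.degrees(1.0) * 3600.0
DAYS_PER_CENTURY = 36525.0

# Mercury's elements and the solar mass, in the units of Table 3.1.
A_MERCURY = 0.38710      # semi-major axis (Earth-Sun distance = 1)
ECC_MERCURY = 0.2056
T_MERCURY = 87.969       # orbital period in days
M_SUN = 3.329e5          # in Earth masses

# (m_p, a_p) for the five perturbing planets retained in the text.
PLANETS = {
    "Venus": (0.815, 0.72333),
    "Earth": (1.000, 1.00000),
    "Mars": (0.1075, 1.52369),
    "Jupiter": (317.83, 5.20280),
    "Saturn": (95.147, 9.53884),
}


def c_factor(m_p, a_p, a=A_MERCURY):
    """C_p of (3.71), built from (3.70a)-(3.70d) evaluated at r = a."""
    x2 = (a / a_p) ** 2
    A = m_p * a / (a_p * (a_p**2 - a**2))
    aA_prime = A * (a_p**2 + a**2) / (a_p**2 - a**2)
    B = 1.0 + x2 / 8.0 + 3.0 * x2**2 / 64.0
    aB_prime = x2 / 4.0 + 3.0 * x2**2 / 16.0
    return A * (B + 0.5 * aB_prime) + 0.5 * aA_prime * B


def coefficient():
    """Coefficient (3.69), converted to seconds of arc per century."""
    rad_per_day = (math.sqrt(1.0 - ECC_MERCURY**2) * math.pi * A_MERCURY**2
                   / (T_MERCURY * M_SUN))
    return rad_per_day * DAYS_PER_CENTURY * ARCSEC_PER_RAD


def precession_rates():
    """Precession rate of Mercury's perihelion due to each planet."""
    k = coefficient()
    return {name: k * c_factor(m_p, a_p) for name, (m_p, a_p) in PLANETS.items()}


if __name__ == "__main__":
    rates = precession_rates()
    for name, rate in rates.items():
        print(f"{name:8s} {rate:8.2f}")
    print(f"{'total':8s} {sum(rates.values()):8.2f}")
```

Running the script gives a total close to the value in (3.72), and makes it easy to see that Venus and Jupiter dominate the sum.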

For comparison with our results, results of the most precise and carefully
checked calculations are displayed in Table 3.2 for Earth as well as Mercury.
In the case of Mercury, the difference of 43''/cent between the observed
advance and the calculated planetary effects was recognized as a serious
problem from the time of the first accurate calculations by Leverrier in 1859.
Many theories were proposed to account for it. These theories are of two
general types: (1) those in which Newton’s law of gravitation is retained, but
the existence of an unobserved planet or ring of material particles inside
Mercury’s orbit is postulated; (2) proposals to modify Newton’s law. More
recently, the possibility that some or all of the discrepancy may be due to the
oblateness of the Sun has been considered seriously. Einstein’s theory of
General Relativity, proposed in 1915, accounted for the discrepancy in
spectacular fashion. His theory provides an explanation of the second type,
but it is distinguished from alternatives by (a) its lack of arbitrariness (no
adjustable constants are introduced), (b) its accurate prediction of other
phenomena such as gravitational deflection of light passing near the Sun, (c)
the fact that it is a derived consequence of deep revisions in the foundations of
mechanics. Observations have been sufficient to completely rule out theories
of the first type only recently, but the exact magnitude of the Sun’s oblateness
effect is still uncertain.


TABLE 3.2. Contributions to the Perihelion Advance of Mercury and Earth

                     Precession rate (arc sec/century)
Cause                Mercury               Earth
Mercury              0.025 ± 0.00          −13.75 ± 2.3
Venus                277.856 ± 0.68        345.49 ± 0.8
Earth                90.038 ± 0.08
Mars                 2.536 ± 0.00          97.69 ± 0.1
Jupiter              153.584 ± 0.00        696.85 ± 0.0
Saturn               7.302 ± 0.01          18.74 ± 0.0
Uranus               0.141 ± 0.00          0.57 ± 0.0
Neptune              0.042 ± 0.00          0.18 ± 0.0
Moon                                       7.68 ± 0.0
Sum                  531.50 ± 0.85         1153.45 ± 2.7
Observed             574.09 ± 0.41         1158.05 ± 0.5
Difference           42.56 ± 0.94          4.6 ± 2.7
Relativity effect    43.03 ± 0.03          3.8 ± 0.0

Reference: G. M. Clemence, Reviews of Modern Physics 19, 361 (1947).

Although we cannot go into Einstein’s theory here, we can evaluate its 
implications for planetary motion. According to Einstein’s theory, the New- 
tonian gravitational force on a planet should be modified by adding the terms 

= 1G? Pav vie — uve) Boe 


f ~ 
Rel Cc 2 


where $\phi$ is the gravitational potential, and c is the speed of light in a vacuum.
For a spherically symmetric Sun, $\phi = -\mu/r$, so


fret = ‘fou 5+ ay( vy L) ov 4] 
c ee r F 


II 


lu r (vf) = 
Seay ee 
A a ee (3.74) 


To evaluate its secular effects on the motion of a planet, we first note that its
secular torque vanishes:

$$\overline{\mathbf{r}\times\mathbf{f}_{\mathrm{Rel}}} = 0. \tag{3.75}$$

Therefore, like a central force, it will not contribute to secular precession or
nutation of planetary orbits, and its effect on apse precession is completely
determined by its secular average $\bar{\mathbf{f}}_{\mathrm{Rel}}$.

Before computing the average, it is convenient to use $v^2 = 2\mu/r + 2E$ to
write (3.74) in the form


(uli). i 


fe.) = Hawk +a ey ge 
: 


We can easily compute the average of each term with the help of Table 2.1. 
Since the orbital element E is to be regarded as constant when computing 
averages, the last term does not contribute, because 


For the first term, we compute 


(=) eee: 
- 2b2 
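To evaluate the second term, we use $\mathbf{v} = \mu h^{-2}(\mathbf{h}\times\boldsymbol{\epsilon} + \mathbf{h}\times\hat{\mathbf{r}})$, which also
implies $\mathbf{v}\cdot\hat{\mathbf{r}} = \mu h^{-2}(\mathbf{h}\times\boldsymbol{\epsilon})\cdot\hat{\mathbf{r}}$. Therefore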
CD eee vibes 
ier {hx e(h x 0)-() + hx i 


The first term on the right vanishes, and to evaluate the second term we 
compute 


(hXe)rr hxXe 


ae 2ab 


Then, since $\mu/h^2 = a/b^2$, we have


(h X eé)-rr 
r’ 


yr _ # a WX yee) _ =e 
re eb 2ab alee 
Thus, for the secular value of (3.76), we find 
fw ues)! = calle B77 
Fret 2u(3£.) + 4| b } pe (3:77) 
Then, for the relativistic contribution to perihelion precession, we find

$$\omega_{\mathrm{Rel}} = \frac{6\pi}{T}\,\frac{\mu}{c^2a(1 - \epsilon^2)} = \frac{6\pi}{T}\,\frac{GM_s}{c^2a(1 - \epsilon^2)}. \tag{3.78}$$

When empirical values are inserted, this gives 43''/cent for Mercury and
3''.8/cent for Earth, as reported in Table 3.2.
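
The numbers quoted for Mercury and Earth can be reproduced in a few lines. The sketch assumes the standard closed form for the relativistic advance per orbit, $6\pi GM_s/[c^2a(1 - \epsilon^2)]$, and uses modern SI values for $GM_s$, c and the astronomical unit, together with an assumed orbital period of 365.256 days for the Earth; none of these constants is quoted in the text, and the function name is hypothetical.

```python
import math

GM_SUN = 1.32712440018e20   # m^3/s^2, assumed modern value
C_LIGHT = 2.99792458e8      # m/s
AU = 1.495978707e11         # m, assumed modern value
DAYS_PER_CENTURY = 36525.0


def relativistic_advance(a_au, ecc, period_days):
    """Relativistic perihelion advance in seconds of arc per century."""
    per_orbit = (6.0 * math.pi * GM_SUN
                 / (C_LIGHT**2 * a_au * AU * (1.0 - ecc**2)))   # radians
    orbits_per_century = DAYS_PER_CENTURY / period_days
    return math.degrees(per_orbit * orbits_per_century) * 3600.0


if __name__ == "__main__":
    print(f"Mercury: {relativistic_advance(0.38710, 0.2056, 87.969):.2f}")
    print(f"Earth:   {relativistic_advance(1.0, 0.0167, 365.256):.2f}")
```

The strong dependence on $1/a$ and on the number of orbits per century explains why the effect is more than ten times larger for Mercury than for the Earth.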

From observation of the Sun’s surface shape and rotation rate, the Sun’s
quadrupole moment $J_2$ is believed to be comparatively small, contributing less
than one arc second per century to the advance of Mercury’s perihelion. A
much larger $J_2$ would be inconsistent with Einstein’s explanation of the
advance. Skeptics have pointed out that the Sun may be rotating more rapidly
beneath its visible surface, thus producing a larger $J_2$. However, it is not
difficult to show that a $J_2$ large enough to contribute more than 8''/cent to
Mercury’s advance would be inconsistent with empirical data on precession of
nodes and change of inclination for orbits of the inner planets. For we have
seen that $J_2$ contributes to these effects while, because of (3.75), relativity
does not. This issue will be set to rest when NASA completes its goal of
determining the Sun’s $J_2$ accurately with observations on artificial satellites
orbiting close to the Sun.


8-3. Exercises 


(3.1) The gravitational force of the Sun on the Moon is much larger than
that of the Earth. How is it, then, that we can ignore the Sun in a
first approximation when calculating the orbit of the Moon? Support
your explanation with order of magnitude estimates.

(3.2) Use Equation (3.18) to calculate the rate of regression of the lunar
nodes. Compare with the observed result of 6.6''/yr for regression
along the ecliptic. Take into account the fact that the inclination of
the Moon’s orbit to the equator varies by about 10°.

(3.3) The greatest known oblateness effect in the Solar System occurs for
Jupiter’s fifth satellite which is so close to the highly oblate (1/15.4)
planet that the nodes regress more than 2½ complete revolutions
per year. The inclination of the orbit is 0.4°, and a = 181 × 10³ km
for the planet. Use this information to estimate the quadrupole
moment $J_2$ of Jupiter.

(3.4) The artificial Earth satellite Vanguard I launched in 1958 had the
following orbital elements:

Semi-major axis    a = 1.3603 Earth radii
Eccentricity       ε = 0.1896
Inclination        i = 34.26°
Mean motion        n = 3867.3 deg/day
Period             T = 2π/n = 134.05 min

For Vanguard I and a synchronous (24 hr) satellite, calculate the
oblateness and lunar-solar perturbations and compare with the
results of L. Blitzer in the following table:

Secular Perturbations

                        Oblateness    Moon         Sun
                        (deg/day)     (deg/day)    (deg/day)
Vanguard I      Ω̇      −3.02         −0.00028     −0.00013
                ψ̇      +4.41         +0.00039     +0.00018
24-Hr Orbit     Ω̇      −0.057        −0.0030      −0.0014
                ψ̇      +0.084        +0.0042      +0.0019

The estimated lifetime of Vanguard I is 200 yr. How much has its
orbit been changed by the above perturbations since it was
launched?

(3.5) At what distance from the Earth will Lunar-Solar and Oblateness
effects on an Earth satellite be of the same order of magnitude?

(3.6) The Sun produces radial oscillations in the orbit of an Earth satellite
which average to zero over the orbital period. To investigate the
magnitude of this effect, suppose that the unperturbed orbit is a
circle of radius r and the Sun is in the orbital plane. Derive a first
order expression for the radial variation $\delta r$ as a function of the time.
Show that it is harmonic with period T/2 and a maximum displacement

$$(\delta r)_{\max} = \frac{n_s^2}{n^2}\,r,$$

where $n = 2\pi/T = (GM/a^3)^{1/2}$ is the “mean motion” of the satellite
about the Earth and $n_s$ is the mean motion of the Earth about the
Sun. Show that $\delta r$ ranges from less than one meter for a near Earth
satellite to 2500 km for the Moon. Note the similarity of this effect
to the tides.

(3.7) To account for the 43'' discrepancy in the advance of Mercury’s
perihelion, Seeliger proposed the “screened Kepler potential”

$$\phi(r) = -\frac{GM_s}{r}\,e^{-r/d}$$

for the gravitational potential of the Sun. Assuming $d \gg r$, determine
the value of d required. (Reference: N. T. Roseveare, Mercury’s
Perihelion from LeVerrier to Einstein, Clarendon, Oxford,
1982).

Calculate the precession rate of Earth’s perihelion due to Mars, 
Jupiter and Saturn and compare with the results in Table 3.2. 
Equations (3.31) and (3.32) for the secular field of a planet apply
only inside the planet’s orbit. Derive corresponding equations for
the secular field outside the planet’s orbit. Use these equations to
calculate the precession rate of Earth’s perihelion due to Mercury
and Venus. Compare your results with those in Table 3.2.

(3.10) For a charged particle bound by a Coulomb force, determine the
secular effects of a perturbing constant magnetic field $\mathbf B$, including the
Larmor precession frequency.

(3.11) For a charged particle bound by a Coulomb force, determine the
secular effects of a perturbing constant electric field $\mathbf E$.

(3.12) The Newtonian equation of motion for a planet under the influence
of the Sun alone can be written

$$\dot{\mathbf p} = -G M_s m\, \frac{\hat{\mathbf r}}{r^2},$$

where $\mathbf p = m\mathbf v$. Einstein’s Theory of Special Relativity simply
changes the expression for the momentum to $\mathbf p = m\gamma\mathbf v$, where
$\gamma = (1 - v^2/c^2)^{-1/2}$ and $c$ is the speed of light. Adopting this change,
show that the equation of motion can be written in the form of a
perturbed Newtonian equation

$$\dot{\mathbf v} = -G M_s\, \frac{\hat{\mathbf r}}{r^2} + \mathbf f,$$

where, since $v^2 \ll c^2$, the perturbing force is given by

$$\mathbf f = -\frac{\dot\gamma}{\gamma}\,\mathbf v + (1 - \gamma^{-1})\, G M_s\, \frac{\hat{\mathbf r}}{r^2} \approx \frac{G M_s}{c^2}\left[\frac{\mathbf r\cdot\mathbf v}{r^3}\,\mathbf v + \frac{1}{2}\, v^2\, \frac{\mathbf r}{r^3}\right].$$

How much does this contribute to the advance of Mercury’s perihelion?

(3.13) Show that for a constant perturbing force $\mathbf f$ the orbital elements
satisfy the secular equations of motion

$$\dot a = 0, \qquad \dot{\mathbf h} = \frac{3a}{2}\,\mathbf f\times\boldsymbol\varepsilon, \qquad \dot{\boldsymbol\varepsilon} = \frac{3}{2\mu}\,\mathbf f\times\mathbf h.$$

Study the consequent changes in the shape and attitude of the
osculating orbit.
The averaging can be simplified by noting that

$$\mathbf f\times\mathbf h = \mathbf f\times(\mathbf r\times\mathbf v) = \mathbf f\cdot\mathbf v\,\mathbf r - \mathbf f\cdot\mathbf r\,\mathbf v$$

and

$$\overline{\mathbf f\cdot\mathbf r\,\mathbf v + \mathbf f\cdot\mathbf v\,\mathbf r} = 0.$$

(3.14) Solar Wind. The intensity of solar radiation at the mean Earth
distance from the Sun (the solar constant) is

$$S \approx 1.36 \times 10^{3}\ \mathrm{W\,m^{-2}}.$$

On an absorbing body, this produces a radiation pressure normal to
the incident beam of magnitude

$$P = \frac{S}{c} = 4.5 \times 10^{-6}\ \mathrm{N\,m^{-2}},$$

where $c$ is the speed of light.

On Earth satellites at altitudes above 800 km, the effect of
radiation pressure is greater than atmospheric drag. The greatest
effect has been observed on the 30 m ECHO balloon in its nearly
circular orbit at an altitude of 1600 km. Assuming that the balloon is
perfectly reflecting with area/mass = 102 cm² gm⁻¹, estimate its
daily variation in perigee due to radiation pressure. Thus, show that
radiation pressure can have a substantial effect on a satellite’s
lifetime. Note that the solar radiation force can be regarded as
constant, so the results of the preceding exercise can be used.

(3.15) According to the drag paradox, atmospheric drag increases the
speed of a satellite as it spirals inwards. Prove this statement using
only very general assumptions about the drag force.

(3.16) From Section 3-4, we know that the atmospheric drag force on a
satellite is of the form


$$\mathbf F_D = -F_D\,\hat{\mathbf V}, \quad\text{with}\quad F_D = \tfrac12\, C_D\,\rho A V^2,$$


where the drag coefficient $C_D \approx 2$, $\rho$ is the atmospheric density, $A$ is the
effective cross-section area of the satellite, and $V$ is the satellite
velocity with respect to the local atmosphere. The simplest model
atmosphere is spherically symmetric with a density distribution
$\rho = \rho(r)$ which falls off exponentially with distance, so

$$\rho = \rho_0\, e^{-(r - r_0)/\sigma},$$

where $\rho_0$ is the density at some chosen level $r = r_0$ and $\sigma$ is a scale
factor.

Derive secular equations of motion for perturbation by drag 
alone. Averages over the exponential density can be expressed as 
Bessel functions, but a simpler approach, sufficient for rough 
analysis, is to expand o(r) about r = a before averaging over a 
period of the osculating orbit. 

Show that drag does not alter the orbital plane. Derive an 
expression for the secular decay in size of the orbit, and show that 
the orbit tends to become increasingly circular as it decays. (For a
detailed treatment of atmospheric drag, see D. G. King-Hele,
Theory of Satellite Orbits in an Atmosphere, Butterworth, London, 
1964). 


8-4. Spinor Mechanics and Perturbation Theory 


This chapter develops a new spinor formulation of classical mechanics which 
has not yet been widely applied, so it is a promising starting point for new 
research. The spinor formulation of perturbation theory in celestial mechan- 
ics has clear advantages over alternative formulations, so we will concentrate 
on that. But the approach is not without interest in atomic physics as well, for 
one cannot help asking if the classical spinor variables have some definite 
relation to the spinor wave functions in quantum mechanics. The question has 
not yet been studied in any depth. Nor are the purely classical applications of 
spinor mechanics sufficiently well worked out to present here. So we will be 
content with a formulation of the general theory without applications. 


Position Vector and Spinor 


From our study of linear transformations in Chapter 5, we know that 
geometric algebra enables us to write any rotation-dilation of Euclidean 
3-space in the canonical form 


$$\mathbf x' = U^\dagger \mathbf x\, U, \tag{4.1}$$


where $\mathbf x$ and $\mathbf x'$ are vectors and $U$ is a nonzero quaternion (or spinor) with
conjugate $U^\dagger$. This equation describes the rotation and dilation of any given
vector $\mathbf x$ into a unique vector $\mathbf x'$. The modulus $|U|$ of the spinor $U$ is a positive
scalar determined by

$$|U|^2 = U^\dagger U = U U^\dagger.$$

Consequently, the spinor $U$, like any nonzero quaternion, has an inverse

$$U^{-1} = |U|^{-2}\, U^\dagger.$$

Equation (4.1) can now be written in the form

$$\mathbf x' = |U|^2\,(U^{-1}\mathbf x\, U).$$

This exhibits the transformation as the composite of a rotation $U^{-1}\mathbf x U$ and a
dilation by a scale $|U|^2$.
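
As a quick numerical check of (4.1), the sketch below represents vectors and spinors by 2×2 complex matrices. This Pauli-matrix representation is an assumption of the example rather than part of the formalism of the text; in it the conjugate $U^\dagger$ becomes the Hermitian conjugate. The helper names `mul`, `dag`, `vec`, `components`, `spinor`, `rotate_dilate` and `modulus_squared` are illustrative only.

```python
def mul(A, B):
    """Product of two 2x2 complex matrices (stands in for the geometric product)."""
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def dag(A):
    """Hermitian conjugate, representing the conjugation U -> U^dagger."""
    return [[A[0][0].conjugate(), A[1][0].conjugate()],
            [A[0][1].conjugate(), A[1][1].conjugate()]]

def vec(x, y, z):
    """Vector x*sigma_1 + y*sigma_2 + z*sigma_3 written with Pauli matrices."""
    return [[complex(z), complex(x, -y)], [complex(x, y), complex(-z)]]

def components(A):
    """Read the (x, y, z) components back out of a vector matrix."""
    return (A[1][0].real, A[1][0].imag, A[0][0].real)

def spinor(alpha, bx, by, bz):
    """Quaternion U = alpha + i*(bx*sigma_1 + by*sigma_2 + bz*sigma_3)."""
    return [[complex(alpha, bz), complex(by, bx)],
            [complex(-by, bx), complex(alpha, -bz)]]

def rotate_dilate(U, x):
    """Equation (4.1): x' = U^dagger x U."""
    return mul(mul(dag(U), x), U)

def modulus_squared(U):
    """|U|^2 = U^dagger U, which is a scalar multiple of the identity matrix."""
    return mul(dag(U), U)[0][0].real

U = spinor(1.0, 0.0, 0.0, 1.0)                       # U = 1 + i*sigma_3
x_new = components(rotate_dilate(U, vec(1.0, 0.0, 0.0)))
print(x_new, modulus_squared(U))
```

With this choice of $U$ the vector $\boldsymbol\sigma_1$ is turned a quarter turn into the $\boldsymbol\sigma_2$ direction and stretched by $|U|^2 = 2$, which is the composite of rotation and dilation described above.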


We can use this result to represent the position of a particle by a position
spinor $U$ instead of a position vector $\mathbf r$. We simply choose an arbitrary fixed
unit vector $\boldsymbol\sigma_1$ and write

$$\mathbf r = U^\dagger \boldsymbol\sigma_1 U. \tag{4.2}$$

This is just Equation (4.1) applied to a single vector rather than regarded as a
linear transformation of the whole vector space. Squaring (4.2), we get

$$\mathbf r^2 = |U|^4 \quad\text{or}\quad r = |\mathbf r| = |U|^2. \tag{4.3}$$

Thus, the radial distance $r$ is represented as the scale factor $|U|^2$ of a
rotation-dilation.

Although the position vector $\mathbf r$ is uniquely determined by the position
spinor $U$ according to (4.2), the converse is not true. Indeed, if $S$ is a spinor
such that

$$S^\dagger \boldsymbol\sigma_1 S = \boldsymbol\sigma_1, \tag{4.4}$$

then (4.2) gives us

$$\mathbf r = U^\dagger \boldsymbol\sigma_1 U = V^\dagger \boldsymbol\sigma_1 V, \tag{4.5}$$

where

$$V = SU, \tag{4.6}$$

and $S$ is arbitrary except for the condition (4.4). The condition (4.4) simply
states that $\boldsymbol\sigma_1$ is an eigenvector of the rotation $S^\dagger \mathbf x S$. In other words, $S$ may be
any spinor describing a rotation about the $\boldsymbol\sigma_1$ axis, so it can be written in the
parametric form

$$S = e^{\frac12 i \boldsymbol\sigma_1 \varphi}, \tag{4.7}$$

where $\varphi$ is the scalar angle of rotation and $i$ is the unit pseudoscalar.

Let us refer to the transformation (4.6) of $U$ into $V$ as a gauge transformation,
because it is similar to the gauge transformation of a spinor state
function in quantum theory. We say then that Equation (4.2) is invariant
under the one-parameter group of gauge transformations specified by (4.6)
and (4.4) or (4.7). If Equation (4.2) is regarded as a linear transformation of
the vector $\boldsymbol\sigma_1$ into $\mathbf r$, the gauge invariance simply means that this transformation
is invariant under a rotation about the radial axis. We suppose that $\boldsymbol\sigma_1$ is
some definite unit vector, though the choice is arbitrary. Given $\boldsymbol\sigma_1$, by
Equation (4.2) a spinor $U$ determines a unique vector $\mathbf r$, but the vector $\mathbf r$
determines $U$ only up to a gauge transformation. This nonunique correspondence
between spinors and vectors is to be expected, of course, because it
takes four scalar parameters to specify the quaternion $U$ but only three parameters
to specify the vector $\mathbf r$. To associate a unique spinor $U$ with the vector $\mathbf r$,
we must impose some gauge condition consistent with (4.2) to fix the gauge
uniquely. A natural gauge condition appears when we consider kinematics.
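
The gauge invariance of (4.2) can be checked numerically. The sketch below again assumes a Pauli-matrix representation (an assumption of the example, with hypothetical helper names) and verifies that $V = SU$, with $S$ taken from (4.7), yields the same position vector as $U$.

```python
import math

def mul(A, B):
    """Product of two 2x2 complex matrices (stands in for the geometric product)."""
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def dag(A):
    """Hermitian conjugate, representing the conjugation U -> U^dagger."""
    return [[A[0][0].conjugate(), A[1][0].conjugate()],
            [A[0][1].conjugate(), A[1][1].conjugate()]]

def spinor(alpha, bx, by, bz):
    """Quaternion alpha + i*(bx*sigma_1 + by*sigma_2 + bz*sigma_3)."""
    return [[complex(alpha, bz), complex(by, bx)],
            [complex(-by, bx), complex(alpha, -bz)]]

SIGMA1 = [[0j, 1 + 0j], [1 + 0j, 0j]]

def position(U):
    """Equation (4.2): r = U^dagger sigma_1 U, returned as components (x, y, z)."""
    R = mul(mul(dag(U), SIGMA1), U)
    return (R[1][0].real, R[1][0].imag, R[0][0].real)

def gauge(phi):
    """Equation (4.7): S = exp(i sigma_1 phi/2) = cos(phi/2) + i sigma_1 sin(phi/2)."""
    return spinor(math.cos(phi / 2), math.sin(phi / 2), 0.0, 0.0)

U = spinor(0.7, 0.2, -0.5, 1.1)            # an arbitrary position spinor
r_U = position(U)
r_V = position(mul(gauge(0.83), U))        # V = S U, equation (4.6)
print(r_U, r_V)
```

The two printed vectors agree to rounding error for any gauge angle, and their common length equals $|U|^2$, in line with (4.3).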


Velocity Vector and Spinor 


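Let $\mathbf r = \mathbf r(t)$ be the orbit of a particle in position space, so $\mathbf h = \mathbf r \times \dot{\mathbf r}$ is the
angular momentum (per unit mass), and

$$i\,\frac{\mathbf h}{r^2} = \frac{\mathbf r \wedge \dot{\mathbf r}}{r^2} = \mathbf r^{-1}\dot{\mathbf r} - \frac{\dot r}{r}. \tag{4.8}$$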


The kinematic significance of this quantity will become apparent in the 
following. 

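Equation (4.2) relates an orbit $U = U(t)$ in spinor space to an orbit $\mathbf r = \mathbf r(t)$
in position space. We still need to relate the velocity $\dot U$ in spinor space to the
velocity $\dot{\mathbf r}$ in position space. Differentiating $r = |U|^2$, we obtain

$$\dot r = \dot U^\dagger U + U^\dagger \dot U = 2\langle U^\dagger \dot U\rangle_0. \tag{4.9}$$

Next, it will be convenient to introduce a quaternion $W$ defined by

$$W = 2U^{-1}\dot U = 2r^{-1}U^\dagger \dot U = r^{-1}\dot r + i\boldsymbol\omega, \tag{4.10}$$

where $\boldsymbol\omega$ is a vector and (4.9) has been used to determine that $\langle W\rangle_0 = r^{-1}\dot r$.
We can put (4.10) in the form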


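$$\dot U = \tfrac12\, U W, \tag{4.11}$$


from which we obtain $\dot U^\dagger = \tfrac12\, W^\dagger U^\dagger$. If we insert these expressions into the
equation

$$\dot{\mathbf r} = \dot U^\dagger \boldsymbol\sigma_1 U + U^\dagger \boldsymbol\sigma_1 \dot U$$

obtained by differentiating (4.2), we get

$$\dot{\mathbf r} = \tfrac12\,(W^\dagger \mathbf r + \mathbf r W). \tag{4.12}$$

Using (4.10) this can be written

$$\dot{\mathbf r} = \dot r\,\hat{\mathbf r} + \tfrac12\, i\,(\mathbf r\boldsymbol\omega - \boldsymbol\omega\mathbf r).$$

Then using $\mathbf r\boldsymbol\omega = \mathbf r\cdot\boldsymbol\omega + i(\mathbf r\times\boldsymbol\omega)$, we obtain

$$\dot{\mathbf r} = \dot r\,\hat{\mathbf r} + \boldsymbol\omega\times\mathbf r. \tag{4.13}$$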
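
The step from (4.12) to (4.13) can be spelled out as follows. This is only an expansion of the argument just given, using the decomposition $W = r^{-1}\dot r + i\boldsymbol\omega$ from (4.10), so that $W^\dagger = r^{-1}\dot r - i\boldsymbol\omega$:

```latex
\begin{aligned}
\dot{\mathbf r} &= \tfrac12\,(W^\dagger\mathbf r + \mathbf r W)
  = \tfrac12\big[(r^{-1}\dot r - i\boldsymbol\omega)\,\mathbf r
  + \mathbf r\,(r^{-1}\dot r + i\boldsymbol\omega)\big] \\
&= \dot r\,\hat{\mathbf r} + \tfrac{i}{2}\,(\mathbf r\boldsymbol\omega - \boldsymbol\omega\mathbf r)
  = \dot r\,\hat{\mathbf r} + i\,(\mathbf r\wedge\boldsymbol\omega) \\
&= \dot r\,\hat{\mathbf r} + i\,\big[i\,(\mathbf r\times\boldsymbol\omega)\big]
  = \dot r\,\hat{\mathbf r} + \boldsymbol\omega\times\mathbf r .
\end{aligned}
```

The scalar part $\mathbf r\cdot\boldsymbol\omega$ cancels in the difference $\mathbf r\boldsymbol\omega - \boldsymbol\omega\mathbf r$, which is why the radial component of $\boldsymbol\omega$ never enters the velocity.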


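Thus, we identify $\boldsymbol\omega$ as the angular velocity of the orbit $\mathbf r = \mathbf r(t)$.
According to (4.13), the radial component of $\boldsymbol\omega$ is irrelevant to $\dot{\mathbf r}$. Hence,
we are free to eliminate it by introducing the subsidiary condition

$$\boldsymbol\omega\cdot\mathbf r = \langle \boldsymbol\omega\mathbf r\rangle_0 = 0. \tag{4.14}$$

This condition can be written in several equivalent ways; thus,

$$\boldsymbol\omega\mathbf r = -\mathbf r\boldsymbol\omega$$

or,

$$W^\dagger \mathbf r = \mathbf r W, \tag{4.15}$$

which, after inserting (4.10) and (4.2), gives us

$$\dot U^\dagger \boldsymbol\sigma_1 U = U^\dagger \boldsymbol\sigma_1 \dot U. \tag{4.16}$$

This is equivalent to the scalar condition

$$\langle i\, U^\dagger \boldsymbol\sigma_1 \dot U\rangle_0 = \langle i\, \boldsymbol\sigma_1 \dot U U^\dagger\rangle_0 = 0. \tag{4.17}$$


Thus, we have expressed the subsidiary condition as a relation between $U$ and
its derivative $\dot U$.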


Now, using (4.15) in (4.12) we obtain

$$\dot{\mathbf r} = \mathbf r W. \tag{4.18}$$

Solving for $W$ and using (4.10), we get the fundamental result

$$W = 2U^{-1}\dot U = \mathbf r^{-1}\dot{\mathbf r} = r^{-1}\dot r + i\boldsymbol\omega. \tag{4.19}$$

Comparison with (4.8) shows us that the angular velocity is related to the
angular momentum by

$$\boldsymbol\omega = r^{-2}\,\mathbf h.$$

This is a consequence of or, if you prefer, an alternative form of the
subsidiary condition (4.14).
Equation (4.19) specifies completely our desired relation between $\dot U$ and $\dot{\mathbf r}$.


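Various special relations between $\dot U$ and $\dot{\mathbf r}$ are easily derived from it. For
example,

$$|W|^2 = W^\dagger W = \frac{4|\dot U|^2}{r} = \frac{\dot{\mathbf r}^2}{r^2}. \tag{4.20}$$


Useful alternative forms of (4.19) are obtained by multiplying it by $\mathbf r$ and
using (4.2). Thus, we obtain

$$\dot{\mathbf r} = 2U^\dagger \boldsymbol\sigma_1 \dot U, \tag{4.21}$$

or, equivalently,

$$2r\dot U = \boldsymbol\sigma_1 U \dot{\mathbf r}. \tag{4.22}$$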


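The subsidiary condition (4.17) is a gauge condition. To see how it determines
the gauge, consider an arbitrary time dependent gauge transformation
$V = SU$. We wish to relate $\dot V$ to $\dot U$ to determine the effect of the gauge
transformation. Differentiating (4.7) with $\varphi = \varphi(t)$, we have

$$\dot S = \tfrac12\, i\boldsymbol\sigma_1 \dot\varphi\, S = S\, \tfrac12\, i\boldsymbol\sigma_1 \dot\varphi. \tag{4.23}$$

So, using (4.11), we have

$$\dot V = \dot S U + S\dot U = \tfrac12\,(i\boldsymbol\sigma_1 \dot\varphi\, V + V W).$$

With the help of (4.5) we can put this in the form

$$\dot V = \tfrac12\, V\,(i\hat{\mathbf r}\dot\varphi + W). \tag{4.24}$$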


This is a completely general relation showing how $W$ can be altered by a
gauge transformation. Using the specific form (4.19) for $W$, we obtain

$$2V^{-1}\dot V = 2U^{-1}\dot U + i\hat{\mathbf r}\dot\varphi = r^{-1}\dot r + i\,(r^{-2}\mathbf h + \hat{\mathbf r}\dot\varphi). \tag{4.25}$$

This shows explicitly that the gauge transformation adds a radial component
$\dot\varphi\,\hat{\mathbf r}$ to the angular velocity. We can solve (4.25) for $\dot\varphi$, with the result

$$\dot\varphi = -\langle i\hat{\mathbf r}\, 2V^{-1}\dot V\rangle_0 = -2r^{-1}\langle i V^\dagger \boldsymbol\sigma_1 \dot V\rangle_0. \tag{4.26}$$


This reduces to the subsidiary condition (4.17) if and only if $\dot\varphi = 0$. Thus, the
subsidiary condition fixes the gauge to a constant value. In other words, the
gauge can be chosen freely at one time, but its value for all other times is
then fixed by the subsidiary condition.

We have proved that any alternative to our gauge condition will have, in
general, an angular velocity with a nonvanishing radial component. Equation
(4.13) shows that a radial component of the angular velocity will not affect the
velocity $\dot{\mathbf r}$ in position space, so we are free to adopt alternative gauge
conditions. A physically significant alternative will be discussed later.


The Spinor Equation of Motion 


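The spinor acceleration corresponding to the acceleration vector in position
space is most easily found by differentiating (4.22). Thus,

$$2\frac{d}{dt}(r\dot U) = \boldsymbol\sigma_1 \dot U \dot{\mathbf r} + \boldsymbol\sigma_1 U \ddot{\mathbf r} = r^{-1}\, U\left(\tfrac12\, \dot{\mathbf r}^2 + \mathbf r\ddot{\mathbf r}\right).$$

Hence,

$$2\frac{d^2 U}{ds^2} = U\left(\mathbf r\ddot{\mathbf r} + \tfrac12\, \dot{\mathbf r}^2\right), \tag{4.27}$$

where $d/ds = r\, d/dt$. Thus, in spinor space it is natural to introduce a new time
variable $s$ related to inertial time $t$ by

$$\frac{dt}{ds} = r = |U|^2. \tag{4.28}$$
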
Now we have a complete system of equations relating position, velocity, 
acceleration and time variables in position space to corresponding variables in 
spinor space. These are general kinematic results, enabling us to transform 
any problem or relation from position space to spinor space or vice-versa. 

Given the vector equation of motion in position space

$$\ddot{\mathbf r} = -\frac{\mu\,\mathbf r}{r^3} + \mathbf f, \tag{4.29}$$

where $\mathbf f$ is an arbitrary perturbing force (per unit mass), the spinor equation of
motion is obtained by substitution into (4.27). Thus, we obtain

$$2\frac{d^2 U}{ds^2} - EU = U\,\mathbf r\mathbf f = r\,\boldsymbol\sigma_1 U\,\mathbf f, \tag{4.30}$$

where $E$ is the Kepler energy

$$E = \tfrac12\, \dot{\mathbf r}^2 - \frac{\mu}{r} = |U|^{-2}\left(2\left|\frac{dU}{ds}\right|^2 - \mu\right). \tag{4.31}$$


The spinor equation of motion (4.30) becomes a determinate equation in
spinor space when $\mathbf f$ is given as an explicit function of $\mathbf r$ and $\dot{\mathbf r}$, so $\mathbf r\mathbf f$ can be
expressed as a function of $U$ and $\dot U$ by using (4.2) and (4.21). It can be solved
subject to the subsidiary condition in the form (4.16) or (4.17). The subsidiary
condition can be shown to be a constant of motion, so if it is imposed initially,
it is automatically maintained for all subsequent times. Note that the perturbation
factor $\mathbf r\mathbf f = \mathbf r\cdot\mathbf f + i(\mathbf r\times\mathbf f)$ in (4.30) decomposes naturally into a radial
part $\mathbf r\cdot\mathbf f$ which can alter the size and shape of the osculating Kepler orbit and a
torque $i(\mathbf r\times\mathbf f)$ which can alter the attitude of the orbit in space. This is
closely related to the alternative gauge condition discussed below.
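
For readers who want the substitution spelled out, the following sketch fills in the step from (4.27) and (4.29) to (4.30). It assumes that (4.29) has the form $\ddot{\mathbf r} = -\mu\mathbf r/r^3 + \mathbf f$, in agreement with the Newtonian equation quoted later in this section, and it uses only $\mathbf r^2 = r^2$ together with the relation $U\mathbf r = r\boldsymbol\sigma_1 U$, which follows from (4.2) and $UU^\dagger = r$.

```latex
\begin{aligned}
\mathbf r\ddot{\mathbf r} &= \mathbf r\Big(-\frac{\mu\,\mathbf r}{r^3} + \mathbf f\Big)
  = -\frac{\mu}{r} + \mathbf r\mathbf f, \\
2\frac{d^2U}{ds^2} &= U\big(\mathbf r\ddot{\mathbf r} + \tfrac12\dot{\mathbf r}^2\big)
  = U\Big(\tfrac12\dot{\mathbf r}^2 - \frac{\mu}{r}\Big) + U\,\mathbf r\mathbf f
  = EU + U\,\mathbf r\mathbf f, \\
U\mathbf r &= U U^\dagger\boldsymbol\sigma_1 U = r\,\boldsymbol\sigma_1 U
  \quad\Longrightarrow\quad
  2\frac{d^2U}{ds^2} - EU = r\,\boldsymbol\sigma_1 U\,\mathbf f .
\end{aligned}
```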

The spinor equation of motion (4.30) was first derived in a more compli- 
cated form by P. Kustaanheimo in 1964. Kustaanheimo and E. Stiefel recast it 
in a matrix form, which is now known as the KS equation. Geometric algebra 
has enabled us to further simplify the derivation and formulation of the 
equation as well as clarify its interpretation. For example, it helped us see the 
elementary kinematic meaning of the subsidiary condition (4.17), which was 
never recognized in the matrix formulation. 

Stiefel and Scheifele (1971) have shown that solving the perturbed Kepler
problem by integrating the KS equation is numerically more efficient and 
accurate than standard methods for integrating the Newtonian equation of 
motion. Therefore, we can confidently expect no lesser advantage from 
developing the theory for integrating our spinor equation (4.30). We could, of 
course, simply translate the integration methods of Stiefel and Scheifele into 
our language. But we could probably do better by developing new methods 
which exploit the special advantages of geometric algebra. That is a task for 
the future. 

To see what has been gained in the transformation from vector to spinor 
equation of motion, we briefly examine solutions of the unperturbed spinor 
equation, which we can write in the form 

$$U'' - \tfrac12\, E\, U = 0, \tag{4.32}$$

where the primes represent differentiation with respect to $s$. For the case
$E < 0$, this has the mathematical form of the equation for a harmonic
oscillator with natural frequency

$$\omega_0 = (-E/2)^{1/2}. \tag{4.33}$$


So it has the general solution

$$U = U_0 \cos\omega_0 s + \frac{U_0'}{\omega_0}\, \sin\omega_0 s, \tag{4.34}$$

where $U_0$ and $U_0'$ are the initial spinor position and velocity. We can evaluate
$U_0$ and $U_0'$ in terms of the initial position and velocity vectors $\mathbf r_0$ and $\dot{\mathbf r}_0$, using
(4.2) and (4.21). The evaluation is simplified if we use our prior knowledge
that motion lies in a plane. We are free to choose $\boldsymbol\sigma_1 = \hat{\mathbf r}_0$; then the rotation
$\hat{\mathbf r} = U^{-1}\boldsymbol\sigma_1 U$ is confined to the orbital plane, and (4.2) can be put in the form


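$$\mathbf r = \boldsymbol\sigma_1 U^2. \tag{4.35}$$

Therefore,

$$U = (\boldsymbol\sigma_1 \mathbf r)^{1/2}, \tag{4.36}$$

and, in particular,

$$U_0 = U(\mathbf r_0) = r_0^{1/2}. \tag{4.37}$$

Similarly, (4.22) and (4.28) give us

$$U' = \tfrac12\, \boldsymbol\sigma_1 U \dot{\mathbf r}, \tag{4.38}$$

and, in particular,

$$U_0' = \tfrac12\, r_0^{1/2}\, \hat{\mathbf r}_0 \dot{\mathbf r}_0. \tag{4.39}$$

Therefore, the solution (4.34) can be put in the form

$$U = r_0^{1/2}\left[\cos\omega_0 s + \frac{\hat{\mathbf r}_0 \dot{\mathbf r}_0}{2\omega_0}\, \sin\omega_0 s\right]. \tag{4.40}$$

Of course, from (4.33) the value of $\omega_0$ is determined by

$$\omega_0^2 = \tfrac14\left(\frac{2\mu}{r_0} - \dot{\mathbf r}_0^2\right). \tag{4.41}$$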


Now that we have $U = U(s)$ as an explicit function of $s$ determined by the
initial conditions, our solution will be completed by integrating $dt/ds = |U|^2$
to get $s$ as a function of $t$. We shall see that, in fact, this last step is equivalent
to solving Kepler’s equation.
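
The closed-form solution can be tried out numerically. The sketch below is an illustration under stated assumptions, not a method given in the text: it treats motion in the orbital plane, where a spinor of the form $\alpha + \mathbf i\beta$ behaves like an ordinary complex number, so that $U$ becomes a complex number `u` and $\mathbf r = \boldsymbol\sigma_1 U^2$ becomes `z = u**2` with the real axis along $\boldsymbol\sigma_1 = \hat{\mathbf r}_0$. The value `MU = 1.0`, the initial data and the function names are hypothetical choices. It evaluates (4.40) and integrates (4.28) with Simpson's rule to recover the Kepler period.

```python
import math

MU = 1.0  # gravitational parameter (arbitrary units, chosen for illustration)

def kepler_spinor(r0, v0, s):
    """Equation (4.40) in the orbital plane, with sigma_1 along the initial position.

    r0 is the initial distance and v0 the initial velocity as a complex number
    vx + 1j*vy in the frame whose real axis points along the initial position.
    """
    energy = 0.5 * abs(v0) ** 2 - MU / r0           # Kepler energy, assumed negative
    w0 = math.sqrt(-energy / 2.0)                   # equation (4.33)
    return math.sqrt(r0) * (math.cos(w0 * s) + v0 / (2.0 * w0) * math.sin(w0 * s))

def elapsed_time(r0, v0, s_end, steps=2000):
    """Integrate dt/ds = |U|^2, equation (4.28), with Simpson's rule (steps even)."""
    h = s_end / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * abs(kepler_spinor(r0, v0, k * h)) ** 2
    return total * h / 3.0

r0, v0 = 1.0, 1.2j                        # start at periapse with speed 1.2
energy = 0.5 * abs(v0) ** 2 - MU / r0
a = -MU / (2.0 * energy)                  # semi-major axis of the ellipse
w0 = math.sqrt(-energy / 2.0)
s_period = math.pi / w0                   # 2*w0*s advances by 2*pi in one revolution
period = elapsed_time(r0, v0, s_period)
kepler_period = 2.0 * math.pi * math.sqrt(a ** 3 / MU)
apoapse = kepler_spinor(r0, v0, 0.5 * s_period) ** 2    # position z = u^2 at half period
print(period, kepler_period, apoapse)
```

For these initial data the eccentricity is 0.44, so the half-period position should lie on the negative real axis at distance $a(1 + 0.44)$, and the integrated time over one revolution should match $2\pi (a^3/\mu)^{1/2}$.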

Additional insight into the spinor solution is gained by writing it in the 
alternative form 


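$$U = U_+ e^{\mathbf i\omega_0 s} + U_- e^{-\mathbf i\omega_0 s}, \tag{4.42}$$


where $\mathbf i = i\boldsymbol\sigma_3$ is the unit bivector for the orbital plane. From (4.35) it follows
that the bivector part of $U$ must be proportional to $\mathbf i$. Inserting (4.42) into
(4.35) we get

$$\mathbf r = \boldsymbol\sigma_1\left[U_+^2 e^{2\mathbf i\omega_0 s} + U_-^2 e^{-2\mathbf i\omega_0 s} + 2U_+ U_-\right]. \tag{4.43}$$


This should be compared with the parametrization of $\mathbf r$ with respect to the
eccentric anomaly $\phi$:

$$\mathbf r = a\,\hat{\boldsymbol\varepsilon}\left[(\cos\phi - \varepsilon) + \mathbf i\,(1 - \varepsilon^2)^{1/2} \sin\phi\right]. \tag{4.44}$$


If we choose $\boldsymbol\sigma_1 = \hat{\boldsymbol\varepsilon}$, then the comparison tells us that

$$U_\pm = \pm\left(\frac{a}{2}\right)^{1/2}\left[1 \pm (1 - \varepsilon^2)^{1/2}\right]^{1/2}, \tag{4.45}$$

$$\phi = 2\omega_0 s. \tag{4.46}$$


Equation (4.45) tells us how $U_\pm$ are related to the standard orbital elements.
Of course $U_+$ and $U_-$ are alternative orbital elements appropriate in the spinor
theory.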

Equation (4.46) tells us that the parameter $s$ differs from the eccentric
anomaly only by a scale factor $2\omega_0 = (-2E)^{1/2}$, which is itself an important
orbital element. Thus, the eccentric anomaly appears naturally in the spinor
theory, in contrast to its rather ad hoc introduction in the vectorial theory
through Kepler’s equation. Since the parameter $s$ is equivalent to the eccentric
anomaly when $E < 0$, we can be sure that Equation (4.28),
$dt/ds = 2\omega_0\, dt/d\phi = |U|^2$, integrates to Kepler’s equation, so we need not
discuss its solution here. However, Kepler’s equation applies only when
$E < 0$, whereas (4.28) applies also when $E = 0$ or $E > 0$. Thus, $s$ is a
universal parameter, generalizing the eccentric anomaly to apply to all cases.

It is readily verified that the solutions (4.40) and (4.42) apply also when
$E > 0$, provided one understands that the imaginary root of $(-E/2)^{1/2}$ is a
bivector, namely, $\omega_0 = (-E/2)^{1/2} = \mathbf i\,(E/2)^{1/2}$. Stiefel and Scheifele show that
the solution can be cast in a form which applies also when $E = 0$. Then we
have a universal solution of the spinor equation which applies for any energy.
This is important, for perturbations can change the sign of the energy, so one
does not want solutions which break down when that happens.

The striking thing about the unperturbed spinor equation $2U'' - EU = 0$
is the fact that it is a linear differential equation. Thus, the change in variables
from vectors to spinors has linearized the Newtonian equation $\ddot{\mathbf r} + \mu\mathbf r/r^3 = 0$.
Moreover, it has eliminated the singularity at $r = 0$, where $r^{-2}$ becomes
infinite. The elimination of a singularity in this way is called regularization.
Regularization has real practical value, for it eliminates the instabilities
(errors) in numerical integration that occur near a singularity. This is computationally
important in close encounters between celestial bodies, such as a
comet grazing the Sun.
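
To illustrate regularization, the following sketch integrates the unperturbed equation with a fixed step in $s$ for a very eccentric orbit. It is a minimal example under assumptions that are not in the text: planar motion represented by complex numbers (`z = u**2`), a classical fourth-order Runge-Kutta scheme, `MU = 1.0`, and hypothetical function names. The energy is held at its initial value, as is appropriate when there is no perturbation; with a perturbing force the right-hand side would acquire the extra term of (4.30) and $E$ would have to be updated along the orbit.

```python
import cmath
import math

MU = 1.0  # gravitational parameter (illustrative units)

def derivs(state, energy):
    """Right-hand sides of u' = w, w' = (E/2) u, t' = |u|^2 for the unperturbed case."""
    u, w, t = state
    return (w, 0.5 * energy * u, abs(u) ** 2)

def rk4_step(state, energy, h):
    """One classical fourth-order Runge-Kutta step in the fictitious time s."""
    def shifted(base, k, factor):
        return tuple(b + factor * ki for b, ki in zip(base, k))
    k1 = derivs(state, energy)
    k2 = derivs(shifted(state, k1, 0.5 * h), energy)
    k3 = derivs(shifted(state, k2, 0.5 * h), energy)
    k4 = derivs(shifted(state, k3, h), energy)
    return tuple(x + h / 6.0 * (p + 2 * q + 2 * r + s)
                 for x, p, q, r, s in zip(state, k1, k2, k3, k4))

def integrate(z0, v0, s_end, steps):
    """Integrate from planar position z0 and velocity v0, both complex numbers."""
    u0 = cmath.sqrt(z0)                           # r = sigma_1 U^2 becomes z = u^2
    w0 = 0.5 * u0.conjugate() * v0                # u' = conj(u) * v / 2, from (4.38)
    energy = 0.5 * abs(v0) ** 2 - MU / abs(z0)    # constant without perturbations
    state = (u0, w0, 0.0)
    h = s_end / steps
    for _ in range(steps):
        state = rk4_step(state, energy, h)
    return state

a, ecc = 1.0, 0.99                                # a nearly radial ellipse
z0 = a * (1.0 - ecc)                              # start at periapse on the real axis
v0 = 1j * math.sqrt(MU * (1.0 + ecc) / (a * (1.0 - ecc)))
s_period = 2.0 * math.pi * math.sqrt(a / MU)      # one revolution measured in s
u_end, w_end, t_end = integrate(z0, v0, s_period, 2000)
z_end = u_end ** 2
print(z_end, t_end)
```

Even though the periapse distance is only one percent of the semi-major axis, equal steps in $s$ carry the orbit once around and return it to its starting point, and the accumulated time agrees with the Kepler period. The spinor itself returns with opposite sign, which reflects the two-to-one correspondence between $U$ and $\mathbf r$.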

To sum up, the universality, linearity, and regularity of the spinor formulation
are three major reasons for its computational superiority over the
standard vectorial formulation of the general two body problem, and this
becomes more significant when perturbations are included.


An Alternative Gauge Condition 


We have seen that the spinor state function U is related to any acceptable 
alternative state function V by a gauge transformation V = SU. According to 
(4.5), $U$ and $V$ determine the same orbit $\mathbf r = \mathbf r(t)$. As a geometrically significant
alternative to the gauge condition (4.16), consider

$$V^\dagger \boldsymbol\sigma_3 V = r\,\hat{\mathbf h}, \tag{4.47}$$


where $\boldsymbol\sigma_3$ is an arbitrarily chosen fixed unit vector orthogonal to $\boldsymbol\sigma_1$. Equation
(4.47) is consistent with (4.5) since $\mathbf h\cdot\mathbf r = 0$. Therefore it is acceptable as a
gauge condition.

The condition (4.47) has a number of advantages. To begin with, it assures 
that V has a direct geometrical interpretation. The spinor V determines both 
the position r by (4.5) and the plane of motion in position space by (4.47). 
Conversely, given the position r and the plane of motion specified by h, then 
V is determined uniquely (except for sign) by Equations (4.5) and (4.47). 
Thus, V provides a unique and direct description of the position and plane of 
motion at every time. 

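A further advantage of using $V$ appears when we relate it to the spinor $R$
which determines the Kepler frame

$$\mathbf e_k = R^\dagger \boldsymbol\sigma_k R \tag{4.48}$$


($k = 1, 2, 3$). This frame, with $\boldsymbol\sigma_2 = \boldsymbol\sigma_3 \times \boldsymbol\sigma_1$, is specified by the physical
conditions

$$\mathbf e_3 = R^\dagger \boldsymbol\sigma_3 R = \hat{\mathbf h} \tag{4.49}$$

and

$$\mathbf e_1 = R^\dagger \boldsymbol\sigma_1 R = \hat{\boldsymbol\varepsilon}, \tag{4.50}$$


where $\boldsymbol\varepsilon$ is the eccentricity vector pointing towards periapse of the osculating
orbit.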

Equations (4.47), (4.49), and (4.50) determine a unique factorization of the
spinor state function $V$ into

$$V = ZR, \tag{4.51}$$


where $Z$ and $R$ can be regarded as ‘internal’ and ‘external’ state functions
respectively. Consistency of (4.47) with (4.49) implies that

$$Z^\dagger \boldsymbol\sigma_3 Z = r\,\boldsymbol\sigma_3.$$

Hence, we can write $Z$ in the form

$$Z = r^{1/2}\, e^{\frac12 i\boldsymbol\sigma_3 \theta}. \tag{4.52}$$

Then, using (4.47) and (4.50) we obtain

$$\mathbf r = V^\dagger \boldsymbol\sigma_1 V = R^\dagger \boldsymbol\sigma_1 Z^2 R = r\,\hat{\boldsymbol\varepsilon}\, e^{i\hat{\mathbf h}\theta}. \tag{4.53}$$


This exhibits $\theta$ as the true anomaly of the osculating orbit.

The internal state function $Z$ describes the size and shape of the osculating
orbit as well as location on the orbit. If we take the eccentricity $\varepsilon = |\boldsymbol\varepsilon|$, the
angular momentum $h = |\mathbf h|$, and the true anomaly $\theta$ as internal state variables,
then $Z$ is a determinate function $Z = Z(\varepsilon, h, \theta)$ of these variables. Actually,
we can identify $Z$ with $U$ in the unperturbed case, so, according to (4.42) and
(4.45), it is better to choose $\varepsilon$, $a$, $s$ as internal state variables, so $Z = Z(\varepsilon, a, s)$.

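Although the fixed reference frame $\{\boldsymbol\sigma_k\}$ can be chosen arbitrarily, it will
most often be convenient to associate it with an initial osculating orbit of the
particle. For Kepler motion the best choice is

$$\boldsymbol\sigma_3 = \hat{\mathbf h}_0 \quad\text{and}\quad \boldsymbol\sigma_1 = \hat{\boldsymbol\varepsilon}_0, \tag{4.54}$$


where $\mathbf h_0$ is the initial angular momentum and $\boldsymbol\varepsilon_0$ is the initial eccentricity
vector. The initial value of the spinor $V$ is then

$$V_0 = Z_0 = (\hat{\boldsymbol\varepsilon}_0 \mathbf r_0)^{1/2} = r_0^{1/2}\, e^{\frac12 i\hat{\mathbf h}_0 \theta_0}, \tag{4.55}$$


where $\theta_0$ is the initial true anomaly.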

The external state function R determines the attitude of the osculating orbit 
in position space. Of course, R is exactly the attitude spinor used in Sections 
8-2 and 8-3. 

The factorization V = ZR should be of value in perturbation theory, 
because it admits a systematic separation of perturbation effects determined 
by the geometry of the orbital elements. Unfortunately, the spinor equation 
(4.30) loses its simplicity when translated into an equation for V instead of U,
although, of course, V can be identified with U in the absence of perturbations.
On the other hand, if the factorization V = ZR is used, it might be best
to work with a pair of weakly coupled equations for R and Z, but we cannot 
pursue that theme here. 


Chapter 9 


Foundations of Mechanics 


Now that we have become familiar with the content and applications of 
mechanics, we are prepared to examine its conceptual foundations systemati- 
cally. This calls for an explicit formulation and analysis of all presuppositions 
of the theory. It goes beyond a mere statement of Newton’s laws to an 
analysis of the status of laws in a theory and nature of scientific theories in 
general. This kind of study belongs to the philosophy of science, but it is no 
mere academic exercise. The profound revolutions in physics due to Newton 
and Einstein were changes in the conceptual foundations resulting from 
careful analysis. So it takes a study of foundations to fully understand the 
evolution of physics, or, if the facts demand, to instigate a new revolution. 
Improvements in the foundations are truly revolutionary, because they are so 
rare and their repercussions are so extensive, bearing on every application of 
the theory. 

Newton’s original formulation of mechanics nearly 300 years ago is fol- 
lowed with little change in most mechanics books even today. Nevertheless, it 
is not entirely satisfactory for several reasons. First, it is incomplete in the 
sense that not all major assumptions of the theory are explicitly spelled out. 
Second, in the last century Newtonian theory has undergone profound 
modifications and extensions which should be taken into account. To begin 
with, Einstein’s Theory of Relativity has revolutionized the scientific concepts 
of space and time. We now know that any adequate formulation of space and 
time has empirical content with testable consequences. So a clear and explicit 
formulation of these concepts is scientifically as essential to Newtonian 
mechanics as it is to relativity theory. Pedagogically, it is needed to help 
students distinguish between their own vague intuitions of space and time and 
an objective scientific formulation of these concepts. Fortunately, the formu- 
lation can be designed so a small change in the concept of simultaneity 
generates a smooth transition from Newtonian mechanics to relativistic 
mechanics. 

Another big change in mechanics since Newton has been brought about by 
the development of the field concept. Even introductory physics courses 
move rapidly from interactions between particles to interactions of particles
with electric and magnetic fields. We need a formulation of Newton’s laws
which readily accommodates this profound theoretical change. We need to
provide for a smooth transition from pure particle mechanics to the classical 
theory of fields and particles. 

A modern formulation of mechanics should also incorporate profound 
changes in the concept of a theory which have evolved since Newton. Today it 
is widely recognized that physics is concerned with constructing and testing 
mathematical models of physical systems. Thus, the concept of a mathemat- 
ical model is central to the modern conception of a scientific theory. Yet 
physics textbooks scarcely mention models, let alone explain that mathemat- 
ical modeling is the essential core of the scientific method. 


9-1. Models and Theories 


Philosophy is written in that great book which ever lies before 
our eyes — I mean the Universe — but we cannot understand it 
if we do not first learn the language and grasp the symbols in 
which it is written. This book is written in the mathematical 
language, and the symbols are triangles, circles, and other 
geometrical figures, without whose help it is impossible to 
comprehend a single word of it; without which one wanders in 
vain through a dark labyrinth. * 

Galileo Galilei 


This magnificent passage is the capstone of Galileo’s great intellectual 
achievements. It is the first incisive formulation of a philosophical viewpoint 
which played a crucial role in the development of modern science. This 
viewpoint has been so thoroughly assimilated into modern science that most 
scientists take it for granted without recognizing that a profound issue is
involved. On the other hand, it is still debated endlessly in philosophical 
circles, where it is called scientific realism. The importance that Galileo 
himself attached to the above passage is clear from his order that it be placed
at the head of his collected works. 

Scientific realism must be distinguished from the naive realism of common 
sense. The presumption common to all forms of realism is that a “‘real world” 
of things exists independently of any person to observe them. According to 
common sense, things in the real world are just as we see them; they are 
known to us directly through experience, provided the senses are operating 
properly so the view is not distorted. But, as Galileo puts it, scientific realism 
holds that the real world is known only indirectly; it is merely posed to us 
through the senses as a cipher, so to know real things we must decode the
messages of experience. Moreover, the code can be broken only by recognizing
that geometrical properties of things are primary, and we can know them
only conceptually by representing them mathematically.


*Translation from p. 67 of E. A. Burtt, The Metaphysical Foundations of Modern Science,
Routledge and Kegan Paul Ltd., London (1932). Burtt gives a historical account of the origins of
scientific realism.

Galileo’s profound scientific realism evolved from long contemplation and 
a variety of astute observations. Throughout his writings Galileo was occu- 
pied with an analysis of experience to distinguish the “primary properties” 
essential to real objects from ‘‘secondary properties” which depend on the 
mode of human sensation. The analysis was continued by Descartes and 
Boyle among others, and it was a crucial preliminary to Newton’s definitive 
formulation of mechanics in the Principia, from which all reference to 
secondary properties was banished. This decisive step severed psychology 
cleanly from physics, enabling physics to progress without being distracted by 
the complexities of subjective experience. It is the basis today for such 
distinctions as between the perceived color of light (a secondary property) 
and the frequency of light (a primary property), or the pitch of a tone and its 
frequency. The properties ascribed to objects by physics, such as mass, 
velocity, force and frequency, are very different from the directly perceived 
properties of things. Physical properties are primary properties which can be 
represented as quantities. Thus, the distinction between primary and second- 
ary properties was a crucial preliminary to developing a mathematical theory 
of the real world. 

In this chapter we adopt a modern version of scientific realism, which holds 
that objective knowledge about the real world is obtained by developing 
validated mathematical models to represent real objects. Scientific realism 
maintains a sharp distinction between a physical thing and its model, between 
the real world of physical things and the mental world of concepts. One 
should realize, however, that this dualism is only methodological. It by no 
means requires that the physical and mental worlds exist independently of 
one another. It is entirely compatible with an explanation of mental phenom- 
ena in terms of physical brain states. Indeed, the distinction between primary 
and secondary properties opens the possibility of explaining secondary 
properties in terms of primary properties. But this is an issue for neuropsy- 
chology to investigate. What matters here is that scientific realism holds that a 
clear distinction between physical things and their models can be made and 
must be maintained against the contrary tendencies of natural language which 
is infected with naive realism. 

Scientific realism has been vigorously challenged recently by physicists and 
philosophers who hold that it is incompatible with quantum mechanics. They 
claim that quantum mechanics does not allow a sharp separation between the 
state of a real object and an observer’s knowledge of that state. We cannot get 
involved in that debate here. Suffice it to say that the issue has not been 
resolved to the satisfaction of all concerned physicists. Without further 
apology, in this chapter we strive for a sharply formulated theory of scientific 
knowledge from the viewpoint of scientific realism. 
Models


The term “model” is often used in the scientific literature with only a vague
meaning. To sharpen the concept of model, we need terminology which 
expresses clear distinctions and specifications. We assume that a model is a 
conceptual representation of a real object. The represented object is said to 
be a referent of the model. A model may have more than one referent. For 
example, a model of the hydrogen atom has all hydrogen atoms for referents, 
while a model of the solar system has a single referent. The set of all referents 
of a model is called its reference class. If its reference class is empty, a model is 
said to be fictitious. An assignment of a particular referent or reference class 
to a given model is called a factual interpretation of the model, or a physical 
interpretation if the model belongs to physics. A single model may be given 
many different factual interpretations, especially in a mature science like 
physics. For example, the one-dimensional harmonic oscillator may be inter- 
preted as a model for such diverse objects as an elastic solid, a pendulum, a 
diatomic molecule or an atom. 

We are concerned here with mathematical models, though much of our 
discussion applies more generally. A mathematical model has four components: 


(1) A set of names for the object and agents that interact with it, as well as 
for any parts of the object represented in the model. 

(2) A set of descriptive variables (or descriptors) representing properties of 
the object. 

(3) Equations of the model, describing its structure and time evolution. 

(4) An interpretation relating the descriptive variables to properties of 
objects in the reference class of the model. 


Each of these components needs some explication. 

Numerals are often used as object names; thus, we may speak of “particle
1” and “particle 2”. Descriptive variables are functions of the object names,
since each descriptor represents a property of a particular object. For example,
the velocity descriptor $v_k$ for the kth object in a system is an explicit
function of the object name k. Often, however, the dependence of descriptors
on object name is tacitly understood, as when we write v for the velocity of
some object.

There are three types of descriptors: object variables, state variables and 
interaction variables. 

Object variables represent intrinsic properties of the object. For example, 
mass and charge are object variables for a material particle, while moment of 
inertia and specifications of size and shape are object variables for a rigid 
body. The object variables have fixed values for a particular object, but they 
have different values for different objects, so they are indeed variables from a 
general modeling perspective. 
State variables represent intrinsic properties with values which may vary
with time. For example, position and velocity are state variables for a 
particle. A descriptor regarded as a state variable in one model may be 
regarded as an object variable in another model. Mass, for example, is a state 
variable in a model that allows it to change, though it is usually constant in 
particle models. Thus, object variables can be regarded as state variables with 
constant values. 

An interaction variable represents the interaction of some external object 
(called an agent) with the object being modeled. The basic interaction 
variable in mechanics is the force vector; work, potential energy and torque 
are alternative interaction variables. 

Different kinds of property can be distinguished by characteristics of their 
representations as descriptive variables. A property is said to be quantitative if 
it can be represented by mathematical quantities, such as elements of the 
Geometric Algebra. Otherwise it is said to be qualitative. Physics is concerned 
with a particular set of quantitative properties called physical properties. The 
corresponding descriptors are called physical variables or physical quantities. 

The equations of a mathematical model describe relations among quanti- 
tative properties. Equations determining the time evolution of the state 
variables are called dynamical equations, or equations of motion in mech- 
anics. In a mature scientific theory, the equations are derived from laws of the 
theory. Otherwise, they must be assumed as hypotheses subject to verification. 

It is common practice in the literature to say that a particular dynamical 
equation constitutes a mathematical model. This should be recognized as a 
loose use of language, for an equation represents nothing unless its variables 
are given factual interpretations. 

The interpretation of a model is specified by a set of attribute functions for 
its properties. The set of objects with a given property is called the scope or 
reference class of that property. The attribute function for a property assigns
particular values of the descriptive variable to objects in its reference class. 
When specific numerical values are assigned to certain variables, these 
variables are said to be instantiated. As examples of instantiation in particle 
mechanics, we have the assignment of a particular mass to a particle or 
particular initial conditions for its trajectory. 

When, for specific instantiations, the equations of a model are sufficient to 
determine specific values for all its descriptors, the model is said to be a 
specific model. A specific model can thus describe a particular object under 
particular circumstances. 
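
The four components of a mathematical model, and the step from a generic model to a specific one by instantiation, can be illustrated with a small program. The following is only a sketch under stated assumptions: the class `ParticleModel`, the helpers `make_oscillator` and `evolve`, the numerical values, and the semi-implicit Euler integration step are all hypothetical choices made for illustration, not part of the theory described here.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ParticleModel:
    """Hypothetical container for the four components of a mathematical model."""
    name: str                               # (1) object name, e.g. "particle 1"
    object_vars: Dict[str, float]           # (2) object variables (fixed for this object)
    state: Dict[str, float]                 # (2) state variables (may vary with time)
    force: Callable[[float, float], float]  # (2) interaction variable: force law F(x, v)
    interpretation: Dict[str, str]          # (4) descriptor -> property it represents

    def step(self, dt: float) -> None:
        """(3) Equation of motion m dv/dt = F, advanced by one semi-implicit Euler step."""
        a = self.force(self.state["x"], self.state["v"]) / self.object_vars["mass"]
        self.state["v"] += a * dt
        self.state["x"] += self.state["v"] * dt


def make_oscillator(mass, stiffness, x0, v0):
    """Instantiate the variables of a one-dimensional harmonic oscillator."""
    return ParticleModel(
        name="particle 1",
        object_vars={"mass": mass},
        state={"x": x0, "v": v0},
        force=lambda x, v: -stiffness * x,  # linear restoring force from the agent
        interpretation={
            "x": "displacement from equilibrium",
            "v": "velocity",
            "mass": "mass of the particle",
        },
    )


def evolve(model, dt, n_steps):
    """Apply the equation of motion repeatedly to obtain the time evolution."""
    for _ in range(n_steps):
        model.step(dt)
    return model


# A specific model: all variables instantiated, so the evolution is determined.
osc = make_oscillator(mass=1.0, stiffness=4.0, x0=1.0, v0=0.0)
period = 2 * math.pi * math.sqrt(1.0 / 4.0)
evolve(osc, dt=1e-3, n_steps=round(period / 1e-3))
print(osc.state)  # state after approximately one period
```

The same `ParticleModel` could be given a different factual interpretation (a pendulum at small amplitude, say) by changing only the `interpretation` entries, which is the sense in which one model admits many interpretations.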


Theory 


Evidently we have tacitly employed a theory of some sort in specifying the 
general characteristics of a model. A vaguely defined theory of this sort is 
frequently called Systems Theory in the scientific literature; although it is
seldom formulated in the generality we need here. We may regard Systems
Theory as a theory of theories, or more specifically, a general theory of 
mathematical models. Thus, Systems Theory specifies the characteristics of 
models common to all scientific theories. Consider, for example, the distinc- 
tion between state variables and interaction variables in a model. That 
distinction was first sharply drawn in mechanics. But, as other theories 
developed, many people noticed that the distinction has a wider significance if 
the concepts of state and interaction are suitably generalized. Based on this 
distinction, Systems Theory goes on to describe how complex objects can be 
modeled as systems of interacting parts. Thus, it provides a general theory of
structure and composition of objects of any kind. This too is a generalization 
of concepts developed in physics. A complete development of Systems 
Theory will not be attempted here. However, the general characterization of 
a scientific theory, to which we now turn, may be regarded as part of Systems 
Theory. 

A scientific theory can be regarded as a system of design principles for 
modeling real objects. The theory consists of: 

I. A framework of generic and specific laws characterizing the descriptive
variables of the theory.

II. A semantic base of correspondence rules relating the descriptive variables
to properties of real objects.

III. A superstructure of definitions, conventions and theorems to facilitate
modeling in a variety of situations.

The mathematical language used to formulate a theory is usually taken for
granted. However, it should be recognized that most of the mathematics used 
in physics was developed to meet the theoretical needs of physics. In Chapter 
1, we saw that this is true of the real number system and its generalization to 
Geometric Algebra. Moreover, differential equations were first invented to 
formulate dynamical laws of physics. The moral is that the symbolic calculus 
(mathematics) employed by a scientific theory should be tailored to the 
theory, not the other way around. 

The key concept in a scientific theory is the concept of a scientific law, so it 
should be explicated carefully. A scientific law is a relation or system of 
relations among descriptive variables presumed to represent an objective 
relation or pattern among the corresponding properties. If the relation is among 
physical variables, it is called a physical law. Most physical laws are formulated
as mathematical equations. Scientific realism maintains that it is important
to distinguish between a law and the objective pattern it represents,
because the latter is an unchanging property of the real world while the 
former may be changed when we understand the world better. Moreover, a 
law may be true or false or approximately true, but the property pattern it is 
presumed to represent just “is”. To qualify as a law, a relation among 
descriptive variables must represent a property which is universal in the sense 
that its scope is not limited to a finite number of objects, and it must be 
corroborated in some empirical domain by scientific methods. A proposed law
which has not been experimentally tested and confirmed is called a hypoth- 
esis. Thus, a law is a corroborated hypothesis. 

There are several types of law. The generic laws of a theory define the basic 
descriptive variables of the theory. The generic laws of Classical Mechanics 
fall into two groups: (a) The Zeroth Law, which defines the concepts of 
position, motion and composition of bodies, and (b) Dynamical Laws (New- 
ton’s Laws), which implicitly define the concepts of mass and force. The 
Zeroth Law is so general that it belongs to every physical theory; indeed, it is 
presumed (tacitly at least) in every scientific theory. The Dynamical Laws 
apply only to material objects. In Sections 9-2 and 9-3, these laws will be 
formulated and discussed in detail. 

The specific laws of a theory specify relations among the descriptive 
variables defined by the generic laws. As a rule they apply only to special 
circumstances, whereas the generic laws are presumed to hold in every 
application of the theory. The specific laws of Classical Mechanics are interac- 
tion laws such as Coulomb’s Law, Newton’s Law of Gravitation and Stokes’ 
Law of fluid friction. 

Taking the Zeroth Law for granted, the other basic laws of any scientific 
theory can be classified into dynamical laws, which determine the time 
evolution of state variables, and interaction laws, which interrelate the state 
variables of different objects. 
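
The division of labor between interaction laws and a dynamical law can be sketched in code. In this hedged illustration the function names, the drag coefficient `b`, and the rounded values for the Earth are assumptions introduced for the example; the interaction laws supply forces from the state variables of the objects involved, and the dynamical law (Newton's second law) converts the total force into an acceleration.

```python
import math

G = 6.674e-11  # gravitational constant in N m^2 / kg^2 (rounded)


def newton_gravitation(m1, m2, x1, x2):
    """Interaction law: gravitational force on particle 1 due to particle 2."""
    r = [b - a for a, b in zip(x1, x2)]          # vector from particle 1 to particle 2
    d = math.sqrt(sum(c * c for c in r))         # separation distance
    magnitude = G * m1 * m2 / d**2
    return [magnitude * c / d for c in r]        # attractive: points toward particle 2


def linear_drag(b, v):
    """Interaction law: fluid friction proportional to velocity (hypothetical coefficient b)."""
    return [-b * c for c in v]


def acceleration(mass, forces):
    """Dynamical law: a = (sum of forces) / m."""
    total = [sum(components) for components in zip(*forces)]
    return [c / mass for c in total]


# Usage: a 1 kg particle at the Earth's surface, moving upward at 2 m/s in a fluid.
earth_mass, earth_radius = 5.972e24, 6.371e6     # rounded illustrative values
weight = newton_gravitation(1.0, earth_mass, (0.0, 0.0, earth_radius), (0.0, 0.0, 0.0))
drag = linear_drag(0.5, (0.0, 0.0, 2.0))
print(acceleration(1.0, [weight, drag]))
```

Swapping in a different interaction law changes the model but leaves `acceleration` untouched, which mirrors the statement that generic laws hold in every application while specific laws apply to special circumstances.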

The basic laws of a theory are included in the theory by assumption. The 
superstructure of the theory also contains derived laws, such as Galileo’s law of
falling bodies. As a rule, the scope of a basic law is much wider than the scope 
of a derived law. 

We must be clear about what it means to say that concepts like motion and 
mass are defined by generic laws. All sorts of unnecessary difficulties are 
caused by a sloppy or inadequate concept of definition, so it will be worth our 
while to explicate the concept. The purpose of a definition is to establish the 
meaning of a concept (or the term (symbol) which designates it) by specifying 
its relation to other concepts (terms). When this has been done, we say that 
the concept (term) is well-defined. There are two ways to do it, yielding two 
kinds of definition: explicit and implicit. 

A concept is defined explicitly by expressing it in terms of other concepts. 
This is the conventional notion of definition, used, for example, in defining 
the kinetic energy K by the equation $K = mv^2/2$.

A concept (term) is defined implicitly by a set of axioms which relate it to
other concepts (terms). Thus, the concept of “point” is defined by the axioms
of geometry which specify its relations to other points, lines and planes.
Similarly the concept of “vector” is defined implicitly by specifying how to
add and multiply vectors. In each case axioms define concepts by specifying 
relations. Axioms are set apart from other statements or equations by 
accepting them as definitions, so they need not be proved. Nevertheless, 
terms like “point” and “vector” introduced by axioms are commonly said to
be ‘undefined terms’. This is a misleading expression that ought to be
discarded. Novices often interpret it in the sense of “ill-defined” or “obscure”.
At least they find it unnecessarily mysterious. Evidently it conflicts
with established usage of the term “well-defined”. It would be better to say
that “some terms in a theory must be defined implicitly” rather than “some
terms must be undefined”.

Generic laws are axioms defining basic descriptive variables. Our definition 
of model might have given the impression that descriptive variables can be 
defined independently of any laws. But why are descriptive variables scalar-
or vector-valued; that is, what makes them quantitative? It will be seen that
this is a consequence of the Zeroth Law, which introduces geometrical 
attributes into every physical theory. The generic laws of space and time are 
usually taken for granted, so they are seldom mentioned in the formulation of 
a model. They are essential, nevertheless. A variable which is undefined by 
laws is completely nondescript; it is no more than a name. To be definite 
concepts, descriptive variables must be well-defined by laws. 

Newton’s Laws are sometimes called axioms. That invites confusion be- 
tween the purely mathematical concept of an axiom and the factual concept of 
a law. A law is an axiom, but the converse is not true. A physical law is an 
axiom with a physical interpretation. 

The correspondence rules of a theory determine factual interpretations for 
its descriptive variables and laws, and so for models designed with it. They 
include operational procedures for measurement, that is, the assignment of 
particular values for the descriptors of particular objects. Thus, they deter- 
mine attribute functions relating descriptors to the properties they represent. 
The correspondence rules are not independent of physical laws; rather they 
are specified in accordance with the laws. For example, any operational 
procedure for measuring length must be consistent with the Euclidean properties
of physical space, as specified by the Zeroth Law. Moreover, the laws of
physics often enable us to measure the same physical quantity in many
different ways, so the results of measurement must be independent of the 
particular procedure employed. 

A correspondence rule for measuring a physical quantity is often called an 
‘operational definition’. But this is an abuse of language, confusing the 
concepts of definition and measurement. A definition, whether explicit or 
implicit, relates concepts to concepts, not concepts to things. Mario Bunge 
has suggested that the term “operational definition” be replaced by “operational
referition”, since it is concerned with the semantic concept of reference;
it relates a descriptor (a concept) to its referents (things).

The set of real objects which can be modeled with a theory is called the 
reference class of the theory. The reference class of Classical Mechanics is 
enormous, the set of all material bodies. Yet the generic laws of Mechanics 
model a very small number of properties. The theory asserts that these are 
properties that all bodies have in common, so we call them basic properties. The
fact that the generic laws describe only basic properties does not mean that 
other properties cannot be described by the theory. A composite body has 
new properties not possessed by its parts which emerge when it is assembled.
They are called emergent properties. This challenges theory to explain emer- 
gent properties in terms of basic properties. Indeed, it challenges physicists to 
explain all physical properties of matter — geometrical, mechanical, electri- 
cal, thermodynamic, optical — in terms of a small number of basic properties. 
This grand challenge has long been a major motivation for research. 

The emergent geometrical properties of size and shape can be explained in 
terms of basic properties by the Zeroth Law, which incorporates the physical 
content of Greek geometry. Geometry can be regarded as the theory of size 
and shape. This may be obvious, but it is far from trivial, as witnessed by the 
whole field of architectural design. The Kinetic Theory of Gases is a sub- 
theory of Mechanics which explains temperature as an emergent property. 
The problem of explaining all thermodynamic properties as emergent from 
physical properties of molecules is so complex that a separate theory, Statisti- 
cal Mechanics, has been developed to handle it. More specialized theories 
like Plasma Physics, Solid State Physics and Theoretical Chemistry are also 
concerned with explaining emergent properties. All these theories are founded 
on Classical Mechanics as well as Quantum Mechanics. 

Having discussed the general features of models and theories, let us turn 
now to a formulation of generic laws for Classical Mechanics. 


9-2. The Zeroth Law of Physics 


Everyone has well-developed notions of space and time abstracted from 
personal experience. Perceptual categories of space and time are essential 
for sorting out sensory data. However, perceptual space and time must 
sharply be distinguished from the concepts of physical space and time. The 
former is a modus operandi of the human brain — the proper study of 
psychology, psychophysics and neuroscience. It provides an intuitive base for 
the physical concepts. But the concepts of physical space and time are 
objective rather than intuitive. Intuitive concepts are subjective, which is to 
say that they vary from person to person; whereas objective concepts are the 
same for everyone. Objectivity is achieved in science by providing concepts 
with explicit mathematical definitions and factual interpretations in terms of 
rules which might be applied by anyone, or by a computer for that matter. Of 
course, everyone’s conception of space and time combines intuitive and 
objective components. But only the objective component will concern us 
here.

Objective concepts evolve with changes in their definitions and interpretations.
Since Newton’s day two major improvements in the concepts of space
and time have evolved which should be incorporated into the foundations of
mechanics. First, we have learned to distinguish between mathematical and 
physical geometries. Scientific realism regards physical geometry as a feature 
of the real world which we model with a mathematical geometry. Thus, our 
model geometries should be subjected to empirical tests. In Newton’s day no 
one had conceived of an alternative to Euclidean geometry or the idea of 
testing it, though, of course, it had been subjected to many crude informal 
tests when employed in architectural design and construction. Alternatives to 
Euclidean geometry were first conceived by mathematicians in the nineteenth 
century, but none was incorporated into a viable physical theory until Ein- 
stein’s General Theory of Relativity in the twentieth century. We shall formu- 
late a Euclidean model of physical geometry, since that is appropriate for 
classical mechanics. But we aim to do it in a way which makes its “physical
content” explicit, and allows for easy generalization to “relativistic theories.”

The second major improvement in concepts of space and time is due mainly 
to Einstein. He recognized that the concept of distant simultaneity is an 
essential part of the time concept which had not previously been explicitly 
defined in classical physics. Rather, physicists had unwittingly adopted an 
implicit concept of simultaneity which was inconsistent with ideas of causality 
and experimental fact. By supplying an appropriate definition of distant 
simultaneity and analyzing its consequences, Einstein created his Special 
Theory of Relativity. Thus, the Special Theory is best regarded as a com- 
pletion of classical physics with a full elucidation of the time concept. 

The change instituted by Einstein in the classical time concept appears to 
be comparatively small, but its consequences are immense. It implies that 
space and time are relative concepts which cannot be defined independently 
of one another and do not correspond to unique features of the real world. It 
implies that the real physical geometry is a non-Euclidean geometry of a 
4-dimensional entity space-time, with respect to which the separate concepts 
of space and time only describe the viewpoint of a particular observer. Thus, 
a small change in the time concept has profoundly altered the physicists’ 
conception of reality. 

The Special Theory of Relativity will be discussed in a sequel to this book, 
NF II. Here we will be content with preparing the way for a smooth transition 
to the modern space-time concept by elucidating the classical concepts of 
space and time. We begin with the concept of space. 

The problem of providing the concept of space with a precise mathematical 
formulation has been solved to nearly everyone’s satisfaction. But physicists 
are still far from agreement on the physical status of space. Is space a thing or 
a property of things? Or is it a property of the human mind, a “category of the 
understanding,” as the philosopher Immanuel Kant proposed? Every kind of
answer can be found in the literature. This attests to widespread confusion 
about the conceptual foundations of physics. Confusion is perpetuated by an 
outmoded concept of space which infects our natural language. Thus, we
speak of physical objects in space as if space were a container with an
existence independent of its contents. The literature shows that physicists are 
not immune to this infection, but a cure can be achieved by a careful 
conceptual analysis. The source of the infection is easy to identify. The 
natural language was developed to describe features of perceptual experi- 
ence, which it can do with remarkable fidelity. The brain does, indeed, 
contain a sensorium, a carrier of perceptions which exists independently of its 
contents. This is reflected in perceptual experience and so in the natural 
language. Thus, a cure for confusion about the nature of space begins with a 
clear distinction between the perceptual space of subjective experience and 
the objective concept of physical space. The complete cure requires a rigor- 
ous formulation of the physical concept in perfect accord with experimental 
practice. 

To ascertain a suitable physical interpretation for the concept of space, we 
must examine the role of geometry in experimental practice. We note that 
every measurement of distance determines a relation between two objects. 
Every measurement of position determines a relation between one object and 
some other object or system of objects. In accordance with the standpoint of 
scientific realism, we regard such measured relations as representations of 
real properties of real objects. These are mutual (or shared) properties 
relating one object to another. We call them geometrical properties. We are 
now prepared for an explicit formulation of physical space as a system of 
relations among physical objects. 

To begin with, we recognize two kinds of objects, particles and bodies 
which are composed of particles. Given a body $\mathcal{R}$ called a reference frame,
each particle has a geometrical property called its position with respect to $\mathcal{R}$. We
characterize this property indirectly by introducing the concept of Position
Space, or Relative Space, if you prefer. For each reference frame $\mathcal{R}$, a position
space $\mathcal{P}$ is defined by the following postulates:


A. $\mathcal{P}$ is a 3-dimensional Euclidean space.
B. The position (with respect to $\mathcal{R}$) of any particle can be represented as a
point in $\mathcal{P}$.


The first postulate specifies the mathematical structure of a position space 
while the second postulate supplies it with a physical interpretation. Thus, the 
postulates define a physical law, for the mathematical structure implies 
geometrical relations among the positions of distinct particles. Let us call it 
the Law of Spatial Order. 

Notice that this law asserts that every particle has a property called position 
and it specifies properties of this property. But it does not tell us how to 
measure position. Measurement is a separate matter, since it entails corre- 
spondence rules as well as laws. In actual practice the reference frame is often 
fictitious, though it is related indirectly to a physical body. Our discussion is 
simplified by feigning that the reference frame is always a real body. 
We turn now to the problem of formulating the scientific concept of time.
We begin with the idea that time is a measure of motion, and motion is a 
change of position with respect to a given reference frame. The concept of time 
embraces two distinct relations: temporal order and distant simultaneity. To 
keep this clear we introduce each relation with a separate postulate. 

First we formulate the Law of Temporal Order: 


The motion of any particle with respect to a given reference frame can be 
represented as an orbit in position space. 


This postulate has a semantic component as well as a mathematical one. It 
presumes that each particle has a property called motion and attributes a 
mathematical structure to that property by associating it with an orbit in 
position space. Recall that an orbit is a continuous, oriented curve. Thus, a 
particle’s orbit in position space represents an ordered sequence of positions. 
We call this order a temporal order, so we have attributed a distinct temporal 
order to the motion of each particle. 

To define a physical time scale as a measure of motion, we select a moving 
particle which we call a particle clock. We refer to each successive position of this 
particle as an instant. We define the time interval $\Delta t$ between two instants by


$$\Delta s = c\,\Delta t$$


where c is a positive numerical constant and $\Delta s$ is the arclength of the clock’s
orbit between the two instants. Our measure of time is thus related to the 
measure of distance in position space. 
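
As a rough numerical illustration of this definition, the time interval between two instants can be computed as the arclength of the clock's orbit divided by the constant c. The sketch below assumes the orbit is sampled as a list of points and approximates the arclength by a polygonal path; the function names and sample values are hypothetical.

```python
import math


def arclength(points):
    """Polygonal approximation to the arclength of an orbit sampled at the given points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def time_interval(clock_positions, c):
    """Time interval between the first and last instants of a particle clock.

    clock_positions: successive positions (instants) of the clock in position space.
    c: positive constant fixing the unit of time relative to the unit of distance.
    """
    if c <= 0:
        raise ValueError("c must be a positive constant")
    return arclength(clock_positions) / c


# Usage: a clock moving along a straight segment of length 5, with c = 2.5.
clock = [(0.0, 0.0, 0.0), (1.5, 2.0, 0.0), (3.0, 4.0, 0.0)]
print(time_interval(clock, c=2.5))
```

Nothing in the computation refers to a prior notion of time: only distances in position space and the choice of c enter.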

To use this time scale as a measure for the motions of other particles, we 
need to relate the motions of particles at different places. The necessary 
relation can be introduced by postulating the Law of Simultaneity: 


At every instant, each particle has a unique position. 


This postulate determines a correspondence between the points on the orbit 
of any particle and points on the orbit of a clock. Therefore, every particle 
orbit can be parametrized by a time parameter defined on the orbit of a 
particle clock. 

Note that this postulate does not tell us how to determine the position of a 
given particle at any instant. That is a problem for the theory of measure- 
ment. 

So far our laws permit orbits which are nondifferentiable at isolated points 
or even at every point. These possibilities will be eliminated by Newton's laws 
which require differentiable orbits. We include in the class of allowable 
orbits, orbits which consist of a single point during some interval. A particle 
with such an orbit is said to be at rest with respect to the given reference frame 
during that interval. Of course, we require that the particles composing the 
reference frame itself be at rest with respect to each other, so the reference 
frame can be regarded as a rigid body. 
Note that the speed of a particle is just a comparison of the particle’s
displacement to the displacement of a particle clock. The speed of the particle
clock has the constant value $c = \Delta s/\Delta t$, so the clock moves uniformly by
definition. In principle, we can use any moving particle as a clock, but the
dynamical laws we introduce later suggest a preferred choice. It is sometimes
asserted that a periodic process is needed to define a clock. But any moving 
particle automatically defines a periodic process, because it moves successively 
over spatial intervals of equal length. It should be evident that any real clock can 
be accurately modeled as a particle clock. By regarding the particle clock as the 
fundamental kind of clock, we make clear in the foundations of physics that the 
scientific concept of time is based on an objective comparison of motions. 
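
The idea that speed is a comparison of motions can be made concrete with a short sketch. It assumes that the positions of a particle and of a particle clock have been recorded at the same instants (the pairing that the Law of Simultaneity provides), and it computes an average speed as c times the ratio of the two path lengths. The function names and sample data are hypothetical.

```python
import math


def arclength(points):
    """Polygonal approximation to the length of a sampled orbit."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def average_speed(particle_positions, clock_positions, c):
    """Average speed of a particle, measured against a particle clock.

    Both lists are assumed to be sampled at the same instants, so entry i of
    one list is simultaneous with entry i of the other.
    """
    if len(particle_positions) != len(clock_positions):
        raise ValueError("positions must be paired instant by instant")
    clock_path = arclength(clock_positions)
    if clock_path == 0:
        raise ValueError("the clock must move to define a time interval")
    # The elapsed time is clock_path / c, so speed = particle_path / (clock_path / c).
    return c * arclength(particle_positions) / clock_path


# Usage: the particle covers twice the distance the clock does, so its speed is 2c.
clock = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
particle = [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 4.0, 0.0)]
print(average_speed(particle, clock, c=1.0))
```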

We now have definite formulations of space and time, so we can define a 
reference system as a representation x for the possible position of any particle 
at each time t in some time interval. Each reference system presumes the 
selection of a particular reference frame and particle clock, so x is to be 
interpreted as a point in the position space of that frame. Also, a reference 
system presumes the selection of a particular origin for time and space and 
particular choices for the units of distance and time, so each position and time 
is assigned a definite numerical value. The term “reference system” is
sometimes construed as a system of procedures for constructing a numerical 
representation of space and time. 

After we have formulated our dynamical laws, it will be clear that certain 
reference systems called inertial systems have a special status. Then it will be 
necessary to supplement our Law of Simultaneity with a postulate that relates 
simultaneous events in different inertial systems. That is the critical postulate 
that distinguishes Newtonian theory from Special Relativity, but we defer 
discussion of it until we are prepared to handle it completely. It is mentioned 
now, because our formulation of space and time will not be complete until 
such a postulate is made. 

It is convenient to summarize and generalize our postulates with a single 
law statement, the Zeroth (or Spatiotemporal) Law of Physics: 


Every real object has a continuous history in space and time. 


To explicate this law, we assert that it has four major components: 
1. The Law of Spatial Order. 

2. The Law of Temporal Order. 

3. The Law of Simultaneity. 

4. The Generic Law of Composition. 


The Generic Law of Composition asserts: 


The properties of any real object can be represented mathematically by 
the values of a state function defined on the position and time variables of 
a given reference system. 
A Specific Law of Composition imposes some condition on the nature of the
state function for a particular object or class of objects. The successive values
of the state function as a function of time describe the history of the object.
The Zeroth Law does not specify the history of any object; dynamical laws 
determining the state function are needed for that. But it does assert that 
every object has a history. 

In classical physics, every model of a real object is one of three kinds: 
particle, body or field. Each model is distinguished by a particular state 
function. We have already specified the state function for a particle, namely, 
the function x = x(t) for its orbit in position space. A material particle also has 
a property called mass, so a complete state function must specify any time 
variation of the mass. 

A body is an extended object, which is to say that more than one point in 
Position Space is required to specify its location. We have modeled bodies as 
systems of particles. In this case, the location of a body is the set of positions
of its particles, and its history is the set of particle histories. Alternatively, a 
material body (or material medium) can be modeled as a spatially continuous 
object which does not have a unique decomposition into particles. 

A field is also an extended object, but its state function as well as its 
physical interpretation is quite different from that of a body. The Classical 
Theory of Fields will be discussed in NF II. By way of illustration, let us only 
note here that the theory asserts the existence of real objects called electric 
fields, each of which can be represented by a vector-valued state function E(x, t).

The Zeroth Law applies to Quantum Mechanics as well as classical physics, 
but the state functions for particles are different. The quantum mechanical 
state function for an electron will be discussed in NF II. 

The Zeroth Law is the most universal of all scientific laws. It asserts that 
every real thing that ever existed or will exist has definite spatiotemporal 
properties, that is, definite spatiotemporal relations to every other real thing. 
Some aspect of the Zeroth Law is presumed in every scientific theory and 
investigation. 

Other scientists can take the Zeroth Law for granted, but physicists are 
responsible for refining its formulation and testing its consequences. The 
present formulation has been designed for compatibility with the Special 
Theory of Relativity, and it can be directly generalized to the “curved
spacetime” of the General Theory of Relativity, but that must be left for 
another book. The mathematical structure attributed to space and time in our 
formulation is widely accepted by physicists, but the physical interpretation is 
controversial. We have adopted a relational view, interpreting space and time 
as a system of relations among real objects. But some physicists prefer a 
material view, interpreting spacetime as a primal material out of which all 
things are composed, so that objects can be regarded as local variations in the 
properties of spacetime. Thus, the material view interchanges the objects and 
properties of the relational view. Is there a definite empirical distinction
between these two interpretations so we can decide on one over the other? 
That is a profound question which will not be easily answered. At least both 
interpretations are consistent with scientific realism. 


9-3. Generic Laws and Principles of Particle Mechanics 


The spatiotemporal properties of real objects are described by the Zeroth 
Law. To produce a complete physical theory, the Zeroth Law must be 
supplemented by a set of dynamical laws which describe the nature and effect 
of interactions between objects. In Particle Mechanics the interaction prop- 
erty is represented by force functions. A set of generic laws implicitly define 
the concepts of mass and force and assign them a physical interpretation. To 
produce a specific model of interacting particles, the generic laws must be 
supplemented by specific force laws which specify definite force functions. 

Our formulation of the general theory consists of four generic laws, one 
hypothesis and three generic principles. Let us present them all at once, and 
then comment on each one separately. Of course, our formulation presumes 
the Zeroth Law, so the notions of particle, time, position, velocity and 
acceleration are all well-defined. In addition, the formulation is presumed to 
hold only for a certain kind of reference system called an inertial system, 
which is implicitly defined by the First Law. Now we are ready. 


First Law (Law of Inertia): 
In an inertial system, every free particle has a constant velocity. A particle is 
said to be free if the total force on it vanishes. 


Second Law (Law of Causality): 
The total force exerted on a particle by other objects at any specified time 
can be represented by a vector f such that 


f = ma, 


where a is the particle’s acceleration and m is a positive scalar constant 
called the mass of the particle. 


Third Law (Law of Reciprocity): 
To the force exerted by any object on a particle there corresponds an equal 
and opposite force exerted by the particle on that object. 


Fourth Law (Superposition Law): 
The total force f due to several objects acting simultaneously on a particle is 
equal to the vector sum of forces $f_k$ due to each object acting independently;
that is,

$$f = \sum_k f_k.$$

To relate formulations of the laws in different inertial systems, we adopt the 


Hypothesis of Absolute Simultaneity: 
Local events which are simultaneous in one inertial system are simultaneous 
in every inertial system. 

A local event is defined as a change in the position or velocity of a particle. 


Specific force laws need not be regarded as part of the general theory. 
However, they are restricted in form by generic principles. The principles 
function as laws when force functions are unknown. In particular, they 
sharpen the general concept of force defined by the generic laws. However, 
when we have specific force laws that satisfy the principles, the principles are 
superfluous. For this reason we do not call them laws. In Section 3-1 we 
introduced The Principle of Analyticity. Two other principles are important: 


The Principle of Local Interaction: 
The force on a particle at any time is a unique function of particle position 
and position time derivatives; it is independent of the particle’s past or future 
history. 


The Principle of Relativity: 
The laws of mechanics have the same functional form in all inertial systems. 


Comments on the First Law 


The First Law implicitly defines a time scale for an inertial system. For it 
requires that displacements of the system’s particle clock are proportional to 
displacements of a free particle. This amounts to requiring that equal inter- 
vals of time be defined by equal displacements of a free particle. Thus, the 
motion of a free particle determines the time scale for an inertial system up to 
a multiplicative constant. This fundamental kind of scale is called an inertial 
time scale. 

Besides determining a time scale, the First Law associates straight lines in 
an inertial system with free particle motion. So within an inertial system, the 
deviation of any particle from uniform motion in a straight line can be 
attributed to the action of other objects in accordance with the Second Law. 

The First Law defines an inertial system implicitly by specifying a physi- 
cally-grounded criterion which distinguishes it from noninertial reference 
systems. It tells us that an inertial frame can be identified in principle by 
examining the motion of free particles. In practice, such a procedure is 
usually impossible. For inertial frames do not occur naturally, so most 
measurements are done with respect to an accelerated reference frame. And 
free particles are not usually available for experiments either. Such practical 
difficulties should not be construed as casting doubt on the utility of the First 
Law. Rather, they pose the experimentalist with the problem of distinguishing
real forces, due to the interaction of objects, from pseudoforces, due to
the acceleration of his reference frame. He needs the First Law to make such 
a distinction, but he needs the other Laws as well. 

The First Law is evidently not independent of the other Laws, because
they are needed to define what is meant by “free particles”.


Comments on the Second Law 


The formula f = ma by itself is frequently presented as a complete statement 
of the Second Law. Although such a formula is an acceptable mathematical 
axiom, a law statement should include a physical interpretation of the math- 
ematical terms it employs. Of course, an interpretation for f could be supplied 
by a separate postulate, but it is best included in the Second Law since that is 
where f first appears in the theory. 

Thus, our formulation asserts that f represents a physical property called 
“force” which a particle shares with other objects. The vector f itself is 
commonly referred to as “the force on a particle”, so f serves also as a name
for the property it represents. The term “objects” in our law statement is 
presumed to be defined previously by the Laws of Composition as part of the 
Zeroth Law. As said before, we recognize three kinds of objects: particles, 
bodies and fields, and this reduces to two basic kinds if bodies are modeled as 
systems of particles. To produce a pure particle theory, one need only replace 
the word object with the word particle in all our law statements. But our 
formulation generalizes particle theory by allowing interactions with fields. 
Indeed, in a pure field theory particles never interact directly, but only 
through the intermediary of a field. In that case the term “object” in our law 
statements should always be interpreted as “field,” and we need additional
laws to fully describe the properties of fields. 

It is often claimed that f = ma is a definition of force. On the contrary, an 
explicit definition of force is impossible. Rather, the complete set of generic 
laws is required to define f implicitly by specifying the common characteristics 
of all forces. The equation f = ma represents only one characteristic of force. 
It relates the general property of interaction to the general spatiotemporal 
property of motion. The f represents the action of the universe on a particle 
while the ma represents the particle’s response with a change in its state of 
motion. This provides us with a physical interpretation of mass as a measure 
of the strength of a particle’s response to a given force. No other definition or 
interpretation of mass is needed in the theory. 


Comments on the Third Law 
For two interacting particles, the Third Law can be written 


$$f_{12} = -f_{21},$$

where $f_{12}$ is the force of particle 2 on particle 1. This was called the weak form
of the Third Law in Section 6-1. One readily verifies that this relation is
satisfied by Newton’s gravitational force law and the similar Law of Coulomb. 
However, it fails for direct magnetic interactions between charged particles 
(see Exercise 5). 

This failure of the Third Law for a force law of such great physical 
importance raises a serious problem of determining precisely under what 
conditions the Third Law can be expected to hold and what is responsible for 
its failure. The problem is best addressed by considering the Third Law in a 
different form. For a 2-particle system, the Second Law gives us 


$$\frac{dp_1}{dt} = f_{12}, \qquad \frac{dp_2}{dt} = f_{21},$$


where $p_1$ and $p_2$ are momenta of the particles. So the Third Law can be
written


$$\frac{dp_1}{dt} = -\frac{dp_2}{dt}.$$


Thus, the Third Law can be interpreted as a Law of Momentum Exchange. 
Hence a failure of the Third Law would be a failure of momentum conser- 
vation. Today, physicists regard the Law of Momentum Conservation as more 
fundamental than Newton’s Laws because it holds in Quantum Mechanics as 
well as Classical Mechanics with no known exception. Any apparent violation 
of momentum conservation prompts the question: ‘‘What happened to the 
missing momentum?” On several occasions attempts to answer this question 
have led to the discovery of new physical objects, of which the elementary 
particle called the neutrino is a spectacular example. 
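
The reading of the Third Law as a law of momentum exchange can be checked numerically. The following is a minimal sketch and not part of the original formulation: it assumes an inverse-square attraction with unit coupling constant, a semi-implicit Euler time step, and illustrative names (`pair_force`, `total_momentum`, `simulate`) and initial conditions chosen only for this example. Because the two forces are set equal and opposite at every step, the total momentum should be unchanged up to rounding error, while the individual momenta change.

```python
import math


def pair_force(x1, x2, k=1.0):
    """Force on particle 1 due to particle 2 (assumed inverse-square attraction)."""
    r = [b - a for a, b in zip(x1, x2)]      # vector from particle 1 to particle 2
    dist = math.sqrt(sum(c * c for c in r))
    return [k * c / dist**3 for c in r]


def total_momentum(m1, v1, m2, v2):
    """Vector sum of the two particle momenta."""
    return [m1 * a + m2 * b for a, b in zip(v1, v2)]


def simulate(m1, x1, v1, m2, x2, v2, dt=1e-3, steps=2000):
    """Advance two interacting particles with a semi-implicit Euler step."""
    x1, v1, x2, v2 = list(x1), list(v1), list(x2), list(v2)
    for _ in range(steps):
        f12 = pair_force(x1, x2)             # force on particle 1
        f21 = [-c for c in f12]              # Third Law: force on particle 2
        v1 = [v + dt * f / m1 for v, f in zip(v1, f12)]
        v2 = [v + dt * f / m2 for v, f in zip(v2, f21)]
        x1 = [x + dt * v for x, v in zip(x1, v1)]
        x2 = [x + dt * v for x, v in zip(x2, v2)]
    return x1, v1, x2, v2


if __name__ == "__main__":
    m1, m2 = 1.0, 2.0
    v1, v2 = [0.0, 0.4, 0.0], [0.1, -0.3, 0.0]
    before = total_momentum(m1, v1, m2, v2)
    _, v1f, _, v2f = simulate(m1, [0.0, 0.0, 0.0], v1, m2, [1.0, 0.0, 0.0], v2)
    after = total_momentum(m1, v1f, m2, v2f)
    print("total momentum before:", before)
    print("total momentum after: ", after)
```

Replacing `f21` by anything other than the negative of `f12` makes the total momentum drift, which is the numerical counterpart of the question “What happened to the missing momentum?”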

Classical Field Theory accounts for the apparent failure of magnetic inter- 
actions to satisfy momentum conservation by attributing momentum to the 
electromagnetic field. We are not prepared for a quantitative discussion of 
this matter using Field Theory, so we must be content with qualitative 
remarks. Electromagnetic Field Theory allows a particle to interact only with 
fields at the position of the particle. This extends our stated Principle of Local 
Interaction to include field variables. It precludes the possibility of instan- 
taneous interparticle interactions except as an approximation. Rather, the 
interaction between particles is indirect with the field as intermediary. It 
proceeds by a transfer of momentum from one particle to the field; then the 
field transports some of the momentum at the speed of light to the position of 
the second particle where it can be transferred from field to particle, while the 
rest of the momentum may travel freely as electromagnetic radiation. 

The point to be made here is that the Third Law is completely consistent 
with Field Theory if we extend the Principle of Local Interaction and interpret
the “object” in the law statement as a field. Physicists do not ordinarily
speak of “a force exerted by a particle on a field” as in the law statement. But
this just means “rate of momentum transfer from particle to field”, which is a
conventional expression. 

For a unified view of physics, particle mechanics should be regarded as an 
approximation to Classical Field Theory. In this approximation, then, the 
Third Law can be applied to particles acting instantaneously at a distance, as 
in Newton’s theory of gravitation. 


Comments on the Fourth Law 


This Law is sometimes regarded as part of the Second Law, but it deserves an 
independent formulation to emphasize its importance. It helps us “divide and 
conquer” in mechanics by allowing us to decompose complex forces into
simpler parts for separate analysis, just as the Law of Composition allows us 
to decompose extended bodies into particles. Conversely, it allows us to lump 
a great many forces into a single force to be analyzed as a unit. In a word, the 
Third and Fourth Laws are the main mathematical tools for assembling and 
disassembling interactions. 
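
The “divide and conquer” use of the Superposition Law can be sketched in a few lines. This is an illustration under stated assumptions, not a prescription from the text: the three force laws (uniform gravity, a linear spring toward the origin, linear drag), their parameter values, and the helper names (`gravity`, `spring`, `drag`, `total_force`) are all hypothetical choices made for the example. Each law is written and tested separately, and the total force is simply their vector sum.

```python
def vector_sum(vectors):
    """Component-wise sum of an iterable of 3-vectors."""
    total = [0.0, 0.0, 0.0]
    for vec in vectors:
        total = [t + c for t, c in zip(total, vec)]
    return total


def gravity(mass, g=9.8):
    """Uniform gravity along -z (assumed value of g)."""
    return lambda x, v: [0.0, 0.0, -mass * g]


def spring(stiffness):
    """Linear restoring force toward the origin."""
    return lambda x, v: [-stiffness * c for c in x]


def drag(coefficient):
    """Linear drag opposing the velocity."""
    return lambda x, v: [-coefficient * c for c in v]


def total_force(x, v, force_laws):
    """Superposition: the total force is the vector sum of the forces f_k."""
    return vector_sum(law(x, v) for law in force_laws)


if __name__ == "__main__":
    laws = [gravity(2.0), spring(3.0), drag(0.5)]
    print(total_force([1.0, 0.0, 0.5], [0.0, 2.0, 0.0], laws))
```

Adding or removing an interaction changes only the list `laws`; nothing else in a model built on `total_force` needs to be touched.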


Comments on Absolute Simultaneity 


The Hypothesis of Absolute Simultaneity is best regarded as a supplement to 
the First Law. It implies that an inertial time scale set up in one inertial system 
can be employed in any other inertial system, so one time scale suffices for all 
inertial systems. This is equivalent to Newton’s assumption that there exists a 
unique absolute time variable that can be employed in any reference system. 

Absolute simultaneity is called a hypothesis rather than a law here, because 
it is now known to be empirically false, though it is approximately true in a 
large empirical domain. Explicit formulation of this hypothesis, which is 
implicit in Newtonian theory, shows us exactly where Relativity differs from 
Classical Mechanics. Einstein replaced absolute simultaneity with the 


Law of Light Propagation: 
The speed of light has the same constant value in all inertial systems.


With this law we can use an idealized light pulse or photon to construct a 
model particle clock, a photon clock. The photon clock establishes an inertial 
time scale which is the same for all inertial systems and uniquely relates the 
time scale to the distance scale. Moreover, the Law of Simultaneity which we 
introduced as part of the Zeroth Law can now be reduced to a mere definition 
of simultaneity. All this leads to a conceptual fusion of space and time into a 
unified concept of spacetime. The mathematical formulation and analysis of 
these ideas using geometric algebra will be developed in the following volume 
NF II. It should be mentioned here that the Light Propagation Law requires a 
small but significant alteration of the Second Law because it modifies the 
concept of time. But no other changes in the laws are needed to give us
relativistic mechanics. 


Comments on the Local Interaction Principle 


The Principle of Local Interaction is implicit in every treatment of mechanics, 
yet it has not been singled out previously as a postulate of the general theory.
It is essential if we are to conclude from the generic laws that specific forces 
determine definite differential equations for particle orbits. And our aim is to 
formulate the general presumptions of mechanics as explicitly and completely 
as possible. 

Our formulation of Local Interaction allows the force to be a function of 
time derivatives of the position vector to any order. As a rule, the velocity is 
the only time derivative to appear in a specific force law. But there is an 
exception of great theoretical importance, namely, the radiative reaction force 
due to the reaction of electromagnetic radiation on a particle emitting it. This 
force law depends on the third time derivative of position. However, this is 
not the place to study it. 

We have already noted the close relation of Local Interaction to the Third
Law. In a pure particle theory we can combine these postulates to draw
conclusions about the functional form of the two-particle force. Thus, for a
force that depends only on position and velocity we find


$$f_{12} = f[x_1(t), v_1(t), x_2(t), v_2(t)] = -f[x_2(t), v_2(t), x_1(t), v_1(t)].$$ (4.1)


The Relativity Principle restricts the functional form still further. This significantly
restricts the force laws to be considered in a pure particle theory.


Comments on the Relativity Principle 


The effect of a change in reference system on the equations of motion for a 
particle has already been discussed in Section 5-5. Here we will merely 
comment on its general theoretical implications in accordance with the 
Relativity Principle. We saw that the most general transformation of position 
vectors relating one inertial system to another has the form 


$$x \to x' = R^\dagger (x + a + u(t + t_0)) R,$$ (4.2)


where $R$ is a unitary spinor and $R$, $a$, $u$ and $t_0$ are constants. This transformation
is a composite of a space translation, a Galilean transformation, a time
translation and a rigid rotation. It maps a particle orbit $x = x(t)$ onto an orbit
$x' = x'(t') = x'(t + t_0)$. Differentiation therefore gives the general velocity
addition theorem


$$\dot{x} \to \dot{x}' = R^\dagger (\dot{x} + u) R.$$ (4.3)


Another differentiation gives us the transformation of the Causality Law, 
$$m\ddot{x} = f \to m\ddot{x}' = f',$$ (4.4)


showing that its form is unchanged, as required by the Relativity Principle, 
provided the force undergoes the induced transformation 


$$f \to f' = R^\dagger f R.$$ (4.5)
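
The steps from the transformation of position to the transformation of force can be written out explicitly. The following is a worked sketch under one notational assumption: the time-translation constant is written $t_0$, so that $t' = t + t_0$ and differentiation with respect to $t'$ is the same as differentiation with respect to $t$. Since $R$, $a$, $u$ and $t_0$ are constants, each differentiation acts only on $x(t)$ and on the explicit factor $t$.

```latex
% Assumed notation: t_0 is the time-translation constant, t' = t + t_0, so d/dt' = d/dt.
\begin{align*}
x'(t')        &= R^\dagger \bigl( x(t) + a + u\,(t + t_0) \bigr) R, \\
\dot{x}'(t')  &= R^\dagger \bigl( \dot{x}(t) + u \bigr) R
                 && \text{(the constant $a$ drops out; $u\,t$ contributes $u$)}, \\
\ddot{x}'(t') &= R^\dagger\, \ddot{x}(t)\, R
                 && \text{(the constant $u$ drops out)}, \\
m\,\ddot{x}'  &= R^\dagger (m\,\ddot{x})\, R = R^\dagger f R \equiv f'.
\end{align*}
```

The last line shows that the Causality Law keeps its form exactly when the force is rotated along with the acceleration, and that translations and boosts leave the acceleration untouched.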


The Relativity Principle requires more, however. It requires that the func- 
tional form of the force law must be preserved by the transformation (4.2). In 
particular, the two particle force law (4.1) must be of the more restricted form 


$$f_{12} = f[x_1 - x_2, \dot{x}_1 - \dot{x}_2];$$ (4.6)


it must depend only on the relative position $x_1 - x_2$ to be form invariant under
translations, and on the relative velocity $\dot{x}_1 - \dot{x}_2$ to be invariant under Galilean
transformations. Moreover, invariance under time translations implies
that $f$ cannot be an explicit function of time. To be even more specific, if $f_{12}$ is
an algebraic function of $x_1 - x_2$ and $\dot{x}_1 - \dot{x}_2$, then (4.5) is automatically a
consequence of (4.2) and (4.3). This is, indeed, characteristic of the most 
fundamental force laws we have considered. 
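
The form invariance just described can be tested numerically. The sketch below makes several assumptions that are not in the text: the rotation is represented by an ordinary rotation about the z-axis rather than by a spinor, the time-translation constant is called `t0`, and the two force laws (`relative_force`, an inverse-square term plus a drag on relative velocity, and `absolute_force`, a spring tied to the origin) are hypothetical examples. A force law is counted as form invariant if evaluating it on the transformed variables gives the rotated value of the original force.

```python
import math


def rotate_z(vec, angle):
    """Rotate a 3-vector about the z-axis (stand-in for the spinor rotation)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return (c * x - s * y, s * x + c * y, z)


def transform_state(x, v, t, angle, a, u, t0):
    """Translate by a, boost by u, shift time by t0, then rotate rigidly."""
    shifted = tuple(xi + ai + ui * (t + t0) for xi, ai, ui in zip(x, a, u))
    boosted = tuple(vi + ui for vi, ui in zip(v, u))
    return rotate_z(shifted, angle), rotate_z(boosted, angle)


def relative_force(x1, v1, x2, v2, k=1.0, c=0.5):
    """Depends only on relative position and relative velocity."""
    r = tuple(p - q for p, q in zip(x1, x2))
    w = tuple(p - q for p, q in zip(v1, v2))
    dist = math.sqrt(sum(ri * ri for ri in r))
    return tuple(-k * ri / dist**3 - c * wi for ri, wi in zip(r, w))


def absolute_force(x1, v1, x2, v2, k=1.0):
    """Depends on the absolute position of particle 1."""
    return tuple(-k * xi for xi in x1)


def is_form_invariant(force, x1, v1, x2, v2, angle, a, u, t, t0):
    """Compare the force law on transformed variables with the rotated force."""
    expected = rotate_z(force(x1, v1, x2, v2), angle)
    x1p, v1p = transform_state(x1, v1, t, angle, a, u, t0)
    x2p, v2p = transform_state(x2, v2, t, angle, a, u, t0)
    actual = force(x1p, v1p, x2p, v2p)
    return all(math.isclose(p, q, abs_tol=1e-9) for p, q in zip(actual, expected))


if __name__ == "__main__":
    args = dict(x1=(0.5, -1.0, 2.0), v1=(0.1, 0.2, 0.0),
                x2=(1.0, 2.0, -1.0), v2=(-0.3, 0.0, 0.4),
                angle=0.7, a=(1.0, -2.0, 0.5), u=(0.3, 0.1, -0.2), t=1.5, t0=0.25)
    print("relative force invariant:", is_form_invariant(relative_force, **args))
    print("absolute force invariant:", is_form_invariant(absolute_force, **args))
```

The force tied to the origin fails the check because the translation and boost shift the absolute position, which is the reason such a law cannot appear in a pure particle theory.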

Clearly the Relativity Principle is an important modeling principle. It tells 
us that our models should be independent of our chosen (inertial) reference 
system, so interactions should be functions only of relative positions and 
velocities. We have interpreted the transformation (4.2) as a passive change 
in descriptive variables without altering the state of motion of any object. 
Alternatively, for $t_0 = 0$, we can regard (4.2) as a change in description due to
an active rigid displacement and boost in velocity of a single reference body 
(or frame). If our models are to be unaffected by such a shift of the reference 
body, as required by the Relativity Principle, we conclude that the reference 
body must not be interacting with real objects. In other words, the reference 
body must be regarded theoretically as fictitious. Of course, real objects are 
needed as reference bodies in experiments. So the Relativity Principle serves 
as a guide to the idealizations required for a theoretical description of 
experiments. 

Another profound implication of the Relativity Principle is found by 
interpreting (4.2) as a transformation with respect to a single reference 
system. The transformation (4.2) maps any orbit x = x(t) onto an orbit 
$x' = x'(t + t_0)$ at a different time and place. According to the Relativity
Principle, these orbits describe physically equivalent (or congruent) proces- 
ses. Thus, the Relativity Principle can be regarded as a general congruence law, 
providing a precise criterion for the equivalence of different physical proces- 
ses at different places and times. This makes it possible to compare results of 
different experiments performed at different places and times. Thus, the 
Relativity Principle provides a theoretical basis for the reproducibility and 
predictability of physical results. 

It should be noted that the Relativity Principle is a semantic principle, 
because it is concerned with the interpretation of descriptive variables, that is,
with the relation of models to their referents. It is appropriate to regard the 
Relativity Principle as a “congruence law”, because it describes an equivalence
relation under rigid transformations in space and time, so it generalizes
the notion of congruence from elementary geometry. This geometrical char- 
acter of the Relativity Principle shows that it should be grouped together with 
the Zeroth and First Laws. These three laws together determine the model of 
space and time used in classical mechanics, and they must all be modified to 
characterize the model of space-time proposed in Einstein’s General Theory 
of Relativity. 


Comments on the Theoretical Structure of Mechanics 


We have completed our formulation of the generic laws and principles of 
Particle Mechanics. These laws and principles compose an axiom system from 
which all results of the theory can, in principle, be derived as theorems. We 
say “in principle” because no one has bothered to develop the theory as an
orderly system of theorems and proofs based on well-defined axioms. A 
major reason for this has been the lack of a complete and appropriate set of 
axioms. Under the influence of recent mathematical fashion, some authors 
have developed axiomatic formulations of mechanics using set theory. But set 
theory is not the right mathematical tool, because it is too general. Conse- 
quently, theorems and proofs in this approach are inordinately unwieldy. 
Geometric algebra is a better tool, because it was designed for the geometri- 
cal job. And our formulation of the axioms conforms well to physical 
practice. 

Of course, we have already derived the results of major interest in me- 
chanics in an informal way, so there is no point to embarking on a formal 
development here. However, it is worth pointing out that formalization of 
mechanics should have some advantages. It can be expected to clarify the 
structure of the theory, eliminate unnecessary redundancy and make results 
more accessible for applications. On the other hand, it must be recognized 
that the organization of mechanics should be dictated by physical rather than 
mathematical considerations. For the purpose of theory is to make specific 
models. 


9-4. Modeling Processes 


Scientific knowledge is of two kinds, factual and procedural. The factual 
knowledge consists of theories, models, and empirical data interpreted (to 
some degree) by models in accordance with theory. A theory is to be 
regarded as factual, rather than hypothetical, because the laws of the theory 
have been corroborated, though theories differ in range of application and 
corroboration. The procedural knowledge of science consists of strategies,
tactics and techniques for developing, validating, and utilizing factual know- 
ledge. It is commonly referred to under the rubric of scientific method. 

The structure of factual knowledge has been explicated in our general 
discussion of models and theories and our detailed analysis of classical 
mechanics. Our aim in this section is to explicate the structure of procedural 
scientific knowledge. The subject is complex, so we cannot hope to produce 
much more than an outline. We will do well to identify organizing principles 
which give the subject some coherence. 

The key to an explication of scientific method is recognizing that the central 
activity of scientists is the development and validation of mathematical 
models. Thus we need to analyze the processes of mathematical modeling. 
We can distinguish two types of modeling process: model development and 
model deployment. The first is concerned primarily with theoretical aspects of 
modeling, while the second is concerned with empirical aspects. Theoretical 
and empirical aspects are often interrelated, so the distinction between 
development and deployment is a matter of emphasis rather than sharp 
separation. Let us proceed to a discussion of each process in turn. 


Model Development 


A model is a surrogate object; it depicts or portrays a real object by 
representing its properties. The properties of a real object are known only 
through their representation in a model; they are never experienced directly. 
Moreover, our knowledge of any real object is always incomplete. Every 
model is an idealization or partial representation of its referent, which is to say 
that some but not all properties are represented in the model. Nevertheless, 
physicists strive to construct complete models of the most elementary consti- 
tuents of matter, such as electrons. (These are the only objects that might be 
simple enough to model in all detail — but that is pure speculation). 

Deliberate idealization is a method of simplification. A model which fails to 
represent known properties of its referent is often useful when those proper- 
ties are regarded as irrelevant or uninteresting. Thus, we model the Earth as a 
particle when concerned with its motion in the solar system. 

The method of deliberate idealization generalizes to the method of succes- 
sive refinements, which is one of the major modeling strategies in science. 
Beginning with a simple model, a sequence of increasingly complex models is 
constructed by successively incorporating additional attributes to represent 
the object with increasing detail. Thus, the simple particle model of the Earth 
is refined by modeling it as a rigid body to describe its rotation, further refined 
by modeling it as an elastic solid to account for the effects of tidal forces; then 
it may be assigned a model atmosphere and molten core to account for its 
thermal properties. The modeling is never finished, as any geophysicist or 
climatologist can attest. 


[Figure 4.1. Model Development]

The process of developing a mathematical model can be analyzed into four
essential stages: (I) Description, (II) Formulation, (III) Ramification, and
(IV) Validation. The stages are implemented consecutively, though back- 
tracking to revise the results of an earlier stage is not uncommon. The entire 
model development process is outlined schematically in Figure 4.1 to indicate 
the kind of information processing in each stage. The figure can be regarded 
as the outline of a modeling strategy as well as a description of the modeling
process. Moreover, it can be regarded as a problem solving strategy, since, by 
and large, physics problems are solved by developing models. 

The modeling strategy outlined in Figure 4.1 is sufficiently general to 
apply to any branch of physics, indeed, to any branch of science. Therefore 
it can be regarded as a general scientific method. However, the implemen- 
tation of each stage in a particular model is theory-specific, that is, the tactical 
details in modeling vary from theory to theory. To understand how the 
strategy applies to mechanics, we need to elaborate on the details of each 
modeling stage. 

(I) The Description Stage begins with a choice of objects and properties to 
be modeled. The theory to be used in modeling depends on the kinds of 
property to be modeled — physical, chemical or biological, for example. When 
an appropriate theory has been chosen, the theory provides a system of 
principles which constrain and direct the modeling process. 

Object description is the first step in modeling. The object description 
begins with a decision on the type of model to be developed. For example, a 
given solid object could be modeled either as a material particle, or as a rigid 
body. Mechanics provides subtheories to facilitate the modeling of objects of 
each type. Complex objects are modeled as composite systems of interacting 
parts, for example, a system of particles or rigid bodies. In that case, the 
object description must specify the composition of the system and the model 
type of each part. Each part can then be modeled separately, and the model 
for the whole system is determined by the way the interacting parts are 
assembled. 

In a process description the state variables of the model are specified. The 
state variables may be either basic or derived. Basic variables are defined 
implicitly by the generic laws (including the Zeroth Law). Derived variables 
must be defined explicitly in terms of basic variables. In mechanics, position 
and velocity are basic variables, while momentum, kinetic energy and angular 
momentum are derived variables. A process description necessarily employs 
the Zeroth Law, so some reference system must be adopted, even if it is not 
mentioned explicitly. 
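
The distinction between basic and derived variables maps naturally onto a small data structure. The following is a minimal sketch, with a hypothetical class name `ParticleState` and the assumption that angular momentum is taken about the origin of the adopted reference system: position and velocity (together with the mass) are stored, while momentum, kinetic energy and angular momentum are computed from them by explicit definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParticleState:
    """State of a particle at one instant, relative to a fixed reference system."""
    mass: float
    position: tuple      # basic variable
    velocity: tuple      # basic variable

    @property
    def momentum(self):
        """Derived variable: p = m v."""
        return tuple(self.mass * c for c in self.velocity)

    @property
    def kinetic_energy(self):
        """Derived variable: K = (1/2) m v^2."""
        return 0.5 * self.mass * sum(c * c for c in self.velocity)

    @property
    def angular_momentum(self):
        """Derived variable: l = x cross p, taken about the origin (an assumption)."""
        x, y, z = self.position
        px, py, pz = self.momentum
        return (y * pz - z * py, z * px - x * pz, x * py - y * px)


if __name__ == "__main__":
    state = ParticleState(2.0, (1.0, 0.0, 0.0), (0.0, 3.0, 0.0))
    print(state.momentum, state.kinetic_energy, state.angular_momentum)
```

Because the derived quantities are properties rather than stored fields, they cannot fall out of step with the basic variables, which mirrors the requirement that derived variables be defined explicitly in terms of basic ones.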

A process is defined as the time evolution of some set of state variables. 
Motion is the basic process in mechanics. The energy conservation law makes 
it convenient, sometimes, to consider the process of energy flow indepen- 
dently of the objects processing the energy. In such a case, one is modeling a 
process rather than an object. A process model omits reference to objects 
underlying the process. 

Graphical or diagrammatic methods are often useful in a process descrip- 
tion. As a rule, only a qualitative graph of the process can be made in the 
description stage of modeling, although a few points, such as initial and final
states, may be specified completely. A quantitative graph is usually possible in
the ramification stage of modeling. 

An interaction description specifies the interaction type and agent for all
interactions in the model, along with appropriate interaction variables, basic 
or derived. This includes internal interactions among the parts of a composite 
system, as well as interactions with external agents. The interaction descrip- 
tion must be coordinated with the process description; a consistent set of 
variables must be chosen, and any changes in interactions between different 
stages of the process must be indicated. In mechanics, for example, use of 
kinetic energy as a state variable calls for use of potential energy and work as
interaction variables. And the description of interactions differs in the proces- 
ses of projectile motion and collision. 

To sum up, the descriptive stage produces complete lists of object names 
and descriptive variables for the model and supplies the model with a physical 
interpretation by providing referential meanings for the variables. 

(II) In The Formulation Stage, the laws of dynamics and interaction are 
applied to get definite equations of change for the state variables. Within a 
given theory, the appropriate choice of laws depends on the type of model 
and descriptive variables, as is clear in examples from mechanics. In a particle
model, Newton’s Second Law is the dynamical law relating basic descriptive 
variables, but conservation laws for energy, momentum and angular momen- 
tum may be more appropriate when derived variables are used. For a system 
of particles with interactions described by equations of constraint, we have 
seen that Lagrange’s equation is the most convenient dynamical law. In a 
rigid body model, we employ separate dynamical laws for translational and 
rotational motion of the body. These laws belong to the superstructure of 
mechanics, being derived from the basic laws and the definition of a rigid body.
The derivation is a special exercise in model formulation which can be carried 
out once and for all. The results can then be applied directly to the formula- 
tion of any rigid body model. 

Besides equations of change, a model may include equations of constraint 
(as indicated in Figure 4.1). The equations of constraint in a model are 
functional relations among descriptive variables (rather than differential 
equations). There are many different kinds, including the so-called Constitu- 
tive Relations or Equations of State, such as the ideal gas law (PV = nRT) in 
thermodynamics and fluid mechanics. 

Implementation of the formulation stage produces an abstract model object 
consisting of the set of descriptive variables and equations of change and 
constraint sufficient to determine values of the state variables. The adjective 
“abstract” signifies that in an abstract model the descriptive variables are 
detached from the referential meanings determined in the descriptive stage.
Thus, the descriptive variables in an abstract model describe nothing in 
particular. The adjective ‘descriptive’ remains appropriate, however, be- 
cause in principle a descriptive variable can always be interpreted by asso- 
ciating it with a referent. 

An abstract model does not represent a particular object. A model of a
particular object consists of an abstract model together with an interpretation
of its descriptive variables; in brief, a concrete model is an interpreted abstract 
model. The detachment of an abstract model from any physical interpretation 
is a step of major scientific and psychological importance. For the abstract 
model takes on a theoretical life of its own which can be studied apart from 
the complexities of a real physical situation. This process of model abstraction 
is crucial to scientific understanding. Paradoxically, physical insight into a 
given physical situation is achieved by sharply separating the perceived
situation from its conceptual representation, that is, by constructing an abstract
model. The physicist uses the same abstract model of a particle subject to a 
constant force to represent many different physical situations, such as a falling 
body or a body sliding on a rough surface. Thus, the model abstraction 
process enables the physicist to recognize common elements in different 
physical situations. Undoubtedly, it plays a role in the discovery of general 
physical laws from ad hoc models constructed without the help of general
laws. 
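
The relation between an abstract model and its interpretations can be illustrated with a short Python sketch. Everything named here is hypothetical and chosen only for illustration: the classes `ConstantForceModel` and `ConcreteModel`, the two interpretations, and the numerical values of mass, `g` and the friction coefficient `mu`. The sketch assumes one-dimensional motion, and for the sliding body it assumes the body keeps moving so that the friction force stays constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConstantForceModel:
    """Abstract model: a particle of mass m subject to a constant force f (1-D)."""
    m: float
    f: float

    def position(self, t, x0=0.0, v0=0.0):
        # Solution of the equation of change m * x'' = f for initial state (x0, v0).
        return x0 + v0 * t + 0.5 * (self.f / self.m) * t * t


@dataclass(frozen=True)
class ConcreteModel:
    """Concrete model: an abstract model plus an interpretation of its variables."""
    abstract: ConstantForceModel
    interpretation: dict


g = 9.8   # m/s^2, assumed value
mu = 0.5  # assumed coefficient of kinetic friction

falling = ConcreteModel(
    ConstantForceModel(m=2.0, f=-2.0 * g),
    {"x": "height of a falling body", "f": "weight"},
)
sliding = ConcreteModel(
    ConstantForceModel(m=2.0, f=-mu * 2.0 * g),
    {"x": "distance along a rough horizontal surface", "f": "kinetic friction"},
)

# Both concrete models share one abstract model type; only the
# parameter values and the referents of the variables differ.
print(falling.abstract.position(1.0), sliding.abstract.position(1.0, v0=10.0))
```

The same `position` method serves both situations, which is the sense in which the abstract model can be studied apart from any particular interpretation.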

(III) In The Ramification Stage the special properties and implications of 
the abstract model are worked out. The equations of change are solved to 
determine trajectories of the state variables with various initial conditions; 
the time dependence of significant derived descriptors, such as energy, is 
determined; results may be represented graphically as well as analytically to 
facilitate analysis. Let us refer to a model object together with one or more of
its main ramifications as a ramified model. 

The ramification process is largely mathematical, but the analysis of results 
is just as important. Especially important is the identification of emergent 
properties in composite systems, such as resonances, stabilities and instabili- 
ties. 

A large part of this book has been devoted to ramifications. We were able 
to work out ramifications of the gravitational two body problem at length, 
because the equations of motion can be solved exactly. On the other hand, we 
found that the ramifications of the gravitational three body problem are only 
partially known. 

(IV) The Validation Stage is concerned with evaluating the ramified model 
by comparing it with some real object-in-situation which it is supposed to 
describe. This may range from a simple check on the reasonableness of 
numerical results to a full-blown experimental test. Validation is a model
deployment process, so we will see it in perspective as we analyze model 
deployment. 


Model Deployment 


Model deployment is the process of matching a ramified model to a specific 
empirical situation. The result is a concrete model that represents objects 
and/or processes in that situation. We say, then, that the situation has been 
modeled by the scientific theory from which the model was developed. The
match of the model to the situation is a correspondence between the values of
descriptive variables in the model and properties of objects in the situation. 
The correspondence is established by measurement procedures, including the 
so-called operational definitions for the variables measured. Measurement
involves error and uncertainty, so the match of model to situation must be
characterized by some measure for “goodness of fit”, and criteria for an
adequate match must be set up. These issues are handled by a theory of 
measurement, which can be regarded as one part of a general theory of model 
deployment. We mention measurement here only to indicate how it fits into 
modeling theory. 
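
As a hedged illustration of a "goodness of fit" measure with a criterion for an adequate match, the Python sketch below uses the chi-square statistic. This is one common choice and is not prescribed by the text. The function names, the threshold `max_reduced`, and the sample data (positions of a falling body with an assumed measurement uncertainty of 0.003 m) are all hypothetical.

```python
def chi_square(measured, predicted, sigma):
    """Sum of squared residuals, each scaled by its measurement uncertainty."""
    return sum(((y - f) / s) ** 2 for y, f, s in zip(measured, predicted, sigma))


def adequate_match(measured, predicted, sigma, max_reduced=2.0):
    """Criterion (assumed): chi-square per data point must not exceed max_reduced."""
    return chi_square(measured, predicted, sigma) / len(measured) <= max_reduced


# Model prediction: distance fallen from rest, x = g t^2 / 2, with g = 9.8 m/s^2.
times = [0.0, 0.1, 0.2, 0.3]
predicted = [0.5 * 9.8 * t * t for t in times]

# Hypothetical measurements (m) and their uncertainties (m).
measured = [0.001, 0.050, 0.193, 0.445]
sigma = [0.003] * len(times)

print(chi_square(measured, predicted, sigma))
print(adequate_match(measured, predicted, sigma))
```

A stricter or looser threshold changes which matches count as adequate, which is why the criterion must be stated along with the measure.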

Different kinds of model deployment can be classified according to differ- 
ent purposes they subserve. A model may be deployed for the purposes of 
scientific explanation, prediction or design. Indeed, we say that an empirical 
phenomenon can be explained scientifically if and only if it can be adequately 
modeled by a scientific theory. Scientific predictions are generated by process 
models which relate the values of property variables at different times. 
Scientific design involves the development of models to be deployed as plans 
for the construction of physical systems with specified properties. 

The assertion that scientific explanation is a kind of model deployment 
deserves some further comment, since scientific explanation is not ordinarily 
characterized that way. There are two common kinds of scientific explanation: 
causal and inferential. A causal explanation of an event A is supplied by 
identifying its cause, consisting of agents and conditions sufficient to produce 
A. An inferential explanation of A is supplied by identifying a mechanism (or 
law) which accounts for A. Each kind of explanation employs one of the 
essential ingredients of a model. Thus they employ partial models and should 
be regarded as partial explanations only. A complete explanation requires a 
complete model. 

Empirical tests of a scientific theory are variants of the three major kinds of 
model deployment we have just discussed. A theory can be tested only 
indirectly by testing for the empirical adequacy of models developed from the 
theory. A particular hypothesis can be tested only as part of a theory which is 
sufficient for the design of testable models and only against an alternative 
hypothesis which is a candidate to replace it. A test is made by comparing the 
adequacies of models generated with the alternative hypotheses. 

This discussion of model deployment was necessarily brief, because a 
systematic theory of deployment processes is yet to be developed. The subject 
is complex, but the concept of model deployment appears to be the thread 
needed to tie up a lot of loose ends in the methodology of science. 


Exercises for Chapter 9 


1. Does the Zeroth Law imply the existence of a unique physical entity
   which we might identify as physical space?
2. Are space and time objectively real in the sense that they exist
   independently of any human mind?
3. Develop an explicit formulation of a Law of Molecular Composition,
   providing suitable definitions for the key terms, and carefully
   distinguishing between mathematical structure and physical interpretation.
   Discuss the scope and validity of the law.
4. Design a thought experiment for determining if a given reference system
   is an inertial system.
5. According to the Biot-Savart Law, a moving particle with charge $q_1$
   produces a magnetic field

   $$\mathbf{B}(\mathbf{x}, t) = \frac{q_1}{c}\,\frac{\mathbf{v}_1 \times (\mathbf{x} - \mathbf{x}_1)}{|\mathbf{x} - \mathbf{x}_1|^3},$$

   where $c$ is a constant, $\mathbf{x}_1 = \mathbf{x}_1(t)$, and $\mathbf{v}_1 = \mathbf{v}_1(t)$ is the velocity of the
   particle. Examine the magnetic interaction between two charged
   particles and show that the Law of Reciprocity is not satisfied. How is this
   result affected by including electric interactions? Evaluate the rate at
   which this two particle system transfers momentum to the
   electromagnetic field. What if one particle is initially at rest?
6. Suppose that during all of recorded history the earth was surrounded by a
   dense cloud cover so that the sun, moon and stars could not be seen.
   Suppose also that Newtonian mechanics had developed in spite of this
   handicap. Explain how earthbound physicists could nevertheless detect
   the rotation of the earth and the orbital motion about the sun and thus
   separate the associated pseudoforces from real forces.
7. Examine the change in form of the Second Law induced by changing to a
   time variable which is an arbitrary monotonic function of inertial time.
8. Discuss the change in form of equations of motion when transformed
   from an inertial system to an accelerated reference system.
9. Discuss the following assertion by J. L. Synge:

   “It is futile to ask whether nature is ultimately discrete or continuous, for “discrete” and
   “continuous” are categories of the understanding, not properties of nature”.

10. Make a thorough critique of Eisenbud’s influential article on mechanics
    (below), comparing it in detail with the formulation of mechanics in this
    chapter. Note how the concept of definition is used. Carefully distinguish
    between explicit and implicit definitions, interpretations,
    correspondence rules and measurements.

    L. Eisenbud, ‘On the Classical Laws of Motion’, Am. J. Phys. 26,
    144-159 (1958).


Appendix A 


Spherical Trigonometry 


In Section 2-4 we saw how efficiently geometric algebra describes relations
among directions and angles in a plane; here we turn to the study of such
relations in 3-dimensional space. Our aim is to see how the traditional
subjects of solid geometry and spherical trigonometry can best be handled
with geometric algebra. Spherical trigonometry is useful in subjects as diverse
as crystallography and celestial navigation.

First, let us see how to determine the angle that a line makes with a plane
from the directions of the line and the plane. The direction of a given line is
represented by a unit vector $\hat{a}$, while that of a plane is represented by a unit
bivector $\hat{A}$. The angle $\alpha$ between $\hat{a}$ and $\hat{A}$ is defined by the product

$$\hat{a}\hat{A} = \hat{\alpha}\cos\alpha + i\sin\alpha = \hat{\alpha}\,e^{i\boldsymbol{\alpha}} \tag{A.1}$$

where $i$ is the unit righthanded trivector, $\hat{\alpha}$ is a unit vector and $\boldsymbol{\alpha} = \alpha\hat{\alpha}$. The
vector and trivector parts of (A.1) are

$$\hat{a}\cdot\hat{A} = \hat{\alpha}\cos\alpha, \tag{A.2a}$$
$$\hat{a}\wedge\hat{A} = i\sin\alpha. \tag{A.2b}$$

Our assumption that $i$ is the unit righthanded trivector fixes the sign of $\sin\alpha$,
so, by (A.2b), $\sin\alpha$ is positive (negative) when $\hat{a}\wedge\hat{A}$ is righthanded
(lefthanded). The angle $\alpha$ is uniquely determined by (A.1) if it is restricted to the
range $0 \le \alpha \le 2\pi$.

Equations (A.2a, b) are perfectly consistent with the conventional
interpretation of $\cos\alpha$ and $\sin\alpha$ as components of projection and rejection, as
shown in Figure A.1. Thus, according to the definition of projection by Equation
(4.5b) of Section 2-4, we have

$$P_{\hat{A}}(\hat{a})\,\hat{A} = \hat{a}\cdot\hat{A} = \hat{\alpha}\cos\alpha.$$

We can interpret this equation as follows. First the unit vector $\hat{a}$ is projected
into a vector $P_{\hat{A}}(\hat{a})$ with magnitude $\cos\alpha$. Right multiplication by $\hat{A}$ then
rotates the projected vector through a right angle into the vector $\hat{a}\cdot\hat{A}$, which
can be expressed as a unit vector $\hat{\alpha}$ times its magnitude $\cos\alpha$. Thus, the
product $\hat{a}\cdot\hat{A}$ is equivalent to a projection of $\hat{a}$ into the $\hat{A}$-plane followed by a
rotation by $\tfrac{1}{2}\pi$.

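To interpret the exponential $e^{i\boldsymbol{\alpha}}$ in (A.1), multiply (A.1) by $\hat{a}$ to get

$$\hat{A} = (\hat{a}\,e^{i\boldsymbol{\alpha}})\,\hat{\alpha}.$$

This expresses the unit bivector $\hat{A}$ as the product of orthogonal unit vectors
$\hat{a}\,e^{i\boldsymbol{\alpha}}$ and $\hat{\alpha}$. The factor $e^{i\boldsymbol{\alpha}}$ has rotated $\hat{a}$ into the $\hat{A}$-plane.
We recognize $e^{i\boldsymbol{\alpha}}$ as a spinor which rotates vectors in the $(i\hat{\alpha})$-plane
through an angle $\alpha$. The unit vector $\hat{\alpha}$ specifies the axis of rotation.

Fig. A.1. The angle between a vector and a bivector.

Now let us consider how to represent the angle between two planes
algebraically. The directions of two given planes are represented by unit
bivectors $\hat{A}$ and $\hat{B}$. We define the dihedral angle $\mathbf{c}$ between $\hat{A}$ and $\hat{B}$ by
the equation

$$\hat{B}\hat{A} = e^{i\mathbf{c}}. \tag{A.3}$$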


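Both the magnitude of the angle $c = |\mathbf{c}|$ (with $0 \le c \le 2\pi$) and the direction
$i\hat{c}$ of its plane are determined by (A.3). Separating (A.3) into scalar and
bivector parts, we have

$$\langle\hat{B}\hat{A}\rangle_0 = \hat{B}\cdot\hat{A} = \cos c, \tag{A.4a}$$
$$\langle\hat{B}\hat{A}\rangle_2 = i\hat{c}\sin c. \tag{A.4b}$$

Note that for $c = 0$, Equation (A.3) becomes $\hat{B}\hat{A} = 1$, so $\hat{B} = -\hat{A}$, since
$\hat{B}^2 = \hat{A}^2 = -1$.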


The geometrical interpretation of (A.3) is indicated in Figure A.2, which
shows the two planes intersecting in a line with direction $\hat{c}$. The vector $\hat{c}$ is
therefore a common factor of the bivectors $\hat{A}$ and $\hat{B}$. Consequently
there exist unit vectors $\hat{a}$ and $\hat{b}$ orthogonal to $\hat{c}$ such that
$\hat{A}$ and $\hat{B}$ have the factorizations

$$\hat{A} = \hat{a}\hat{c} = -\hat{c}\hat{a},$$
$$\hat{B} = \hat{c}\hat{b} = -\hat{b}\hat{c}.$$

Note that the order of factors corresponds to the orientations assigned in
Figure A.2. The common factor $\hat{c}$ vanishes when $\hat{B}$ and $\hat{A}$ are multiplied:

$$\hat{B}\hat{A} = (-\hat{b}\hat{c})(-\hat{c}\hat{a}) = \hat{b}\hat{a} = e^{i\mathbf{c}}.$$

Fig. A.2. Dihedral Angle. Note that the orientation of $\hat{B}$ is chosen so it opposes
that of $\hat{A}$ if $\hat{B}$ is brought into coincidence with $\hat{A}$ by rotating it through the angle $c$.

Note that this last equality has the same form as Equation (4.9) with the unit
bivector of the $(\hat{b}\wedge\hat{a})$-plane expressed as the dual $i\hat{c}$ of the unit vector $\hat{c}$. It
should be clear now that (A.4b) expresses the fact that $\langle\hat{B}\hat{A}\rangle_2$ is a bivector
determining a plane perpendicular to the line of intersection of the $\hat{B}$- and
$\hat{A}$-planes and so intersecting these planes at right angles.

Fig. A.3. A spherical triangle.
Fig. A.4. Altitudes of a spherical triangle.

Now we are prepared to analyze the relations among three distinct
directions. Three unit vectors $\hat{a}$, $\hat{b}$, $\hat{c}$ can be regarded as vertices of a spherical
triangle on a unit sphere, as shown in Figure A.3. The sides of the triangle are
arcs of great circles determined by the intersection of the sphere with the
planes determined by each pair of vectors. The sides of the triangle have
lengths $A$, $B$, $C$ which are equal to the angles between the vectors $\hat{a}$, $\hat{b}$, $\hat{c}$. This
relation is completely described by the equations

$$\hat{a}\hat{b} = e^{\mathbf{C}} = \cos C + \hat{C}\sin C, \tag{A.5a}$$
$$\hat{b}\hat{c} = e^{\mathbf{A}} = \cos A + \hat{A}\sin A, \tag{A.5b}$$
$$\hat{c}\hat{a} = e^{\mathbf{B}} = \cos B + \hat{B}\sin B, \tag{A.5c}$$

where $\mathbf{A} = \hat{A}A$, $\mathbf{B} = \hat{B}B$, $\mathbf{C} = \hat{C}C$. The unit bivectors $\hat{A}$, $\hat{B}$, $\hat{C}$ are directions
for the planes determining the sides of the spherical triangle by intersection
with the sphere. Of course, we have the relations

$$\hat{A}^2 = \hat{B}^2 = \hat{C}^2 = -1, \tag{A.6a}$$

as well as

$$\hat{a}^2 = \hat{b}^2 = \hat{c}^2 = 1. \tag{A.6b}$$

The angles $a$, $b$, $c$ of the spherical triangle in Figure A.3 are dihedral angles
between planes, so they are determined by equations of the form (A.3),
namely,

$$\hat{B}\hat{A} = e^{i\mathbf{c}} = \cos c + i\hat{c}\sin c, \tag{A.7a}$$
$$\hat{C}\hat{B} = e^{i\mathbf{a}} = \cos a + i\hat{a}\sin a, \tag{A.7b}$$
$$\hat{A}\hat{C} = e^{i\mathbf{b}} = \cos b + i\hat{b}\sin b, \tag{A.7c}$$

where $\mathbf{a} = \hat{a}a$, $\mathbf{b} = \hat{b}b$, $\mathbf{c} = \hat{c}c$.

Equations (A.5a, b, c) and (A.7a, b, c) with (A.6a, b) understood
determine all the relations among the sides and angles of a spherical triangle. Let
us see how these equations can be used to derive the fundamental equations of
spherical trigonometry. Taking the product of Equations (A.5a, b, c) and
noticing that

$$(\hat{a}\hat{b})(\hat{b}\hat{c})(\hat{c}\hat{a}) = 1,$$

we get

$$e^{\mathbf{C}}e^{\mathbf{A}}e^{\mathbf{B}} = 1. \tag{A.8}$$

This equation can be solved for any one of the angles $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ in terms of the
other two. To solve for $\mathbf{C}$, multiply (A.8) by $(e^{\mathbf{C}})^{-1} = e^{-\mathbf{C}}$ to get

$$e^{-\mathbf{C}} = e^{\mathbf{A}}e^{\mathbf{B}}. \tag{A.9}$$


If the exponentials are expanded into scalar and bivector parts and (A.7a) is
used in the form $\hat{A}\hat{B} = (\hat{B}\hat{A})^{\dagger} = e^{-i\mathbf{c}}$, then (A.9) assumes the expanded form

$$\cos C - \hat{C}\sin C = (\cos A + \hat{A}\sin A)(\cos B + \hat{B}\sin B)$$
$$= \cos A\cos B + \hat{A}\sin A\cos B + \hat{B}\sin B\cos A + (\cos c - i\hat{c}\sin c)\sin A\sin B.$$

Separating this into scalar and bivector parts, we get

$$\cos C = \cos A\cos B + \sin A\sin B\cos c, \tag{A.10}$$

$$-\hat{C}\sin C = \hat{A}\sin A\cos B + \hat{B}\sin B\cos A - i\hat{c}\sin c\,\sin A\sin B. \tag{A.11}$$

Equation (A.10) is called the cosine law for sides in spherical trigonometry. It
relates three sides and an angle of a spherical triangle and determines any one
of these quantities when the other three are known.
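
The cosine law for sides, cos C = cos A cos B + sin A sin B cos c, can be checked numerically with ordinary vector arithmetic. The following Python sketch is only a sanity check under stated assumptions, not part of the algebraic derivation. The helper names and the three sample vectors are hypothetical, and the interior angle at a vertex is computed as the angle between the tangents of the two arcs meeting there, which restricts all angles to the range from 0 to π.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def unit(u):
    n = math.sqrt(dot(u, u))
    return tuple(x / n for x in u)


def angle(u, v):
    # Angle between two unit vectors, clamped against rounding error.
    return math.acos(max(-1.0, min(1.0, dot(u, v))))


def vertex_angle(p, q, r):
    # Interior angle at vertex p: angle between the tangents at p of the
    # great-circle arcs from p to q and from p to r.
    tq = unit(tuple(qk - dot(q, p) * pk for qk, pk in zip(q, p)))
    tr = unit(tuple(rk - dot(r, p) * pk for rk, pk in zip(r, p)))
    return angle(tq, tr)


# Vertices of a sample spherical triangle (arbitrary choice).
a = unit((1.0, 0.2, 0.1))
b = unit((0.1, 1.0, 0.3))
c = unit((0.2, 0.1, 1.0))

# Sides: A is opposite vertex a, and so on.
A, B, C = angle(b, c), angle(c, a), angle(a, b)
c_angle = vertex_angle(c, a, b)

lhs = math.cos(C)
rhs = math.cos(A) * math.cos(B) + math.sin(A) * math.sin(B) * math.cos(c_angle)
print(abs(lhs - rhs))  # difference should be at rounding level
```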

Since the value of $C$ can be determined from $A$ and $B$ by (A.10), the
direction $\hat{C}$ is then determined by (A.11). Thus equations (A.10) and (A.11)
together determine $\mathbf{C}$ from $\mathbf{A}$ and $\mathbf{B}$. Nothing resembling Equation (A.11)
appears in traditional spherical trigonometry, because it relates directions $\hat{A}$,
$\hat{B}$, $\hat{C}$, whereas the traditional theory is concerned only with scalar relations.
Of course, all sorts of scalar relations can be generated from (A.11) by
multiplying by any one of the available bivectors, but they are only of
marginal interest. The great value of (A.11) is evident in our study of
rotations in 3 dimensions in Section 5-3.

We can analyze consequences of (A.7a, b, c) in the same way we analyzed
(A.5a, b, c). Observing that

$$(\hat{B}\hat{A})(\hat{A}\hat{C})(\hat{C}\hat{B}) = -1,$$

we obtain from the product of Equations (A.7a, c, b)

$$e^{i\mathbf{c}}e^{i\mathbf{b}}e^{i\mathbf{a}} = -1. \tag{A.12}$$

This should be compared with (A.8). From (A.12) we get

$$e^{-i\mathbf{c}} = -e^{i\mathbf{b}}e^{i\mathbf{a}}. \tag{A.13}$$

The scalar part of this equation gives us

$$\cos c = -\cos a\cos b + \sin a\sin b\cos C. \tag{A.14}$$

This is called the cosine law for angles in spherical trigonometry. Obviously, it
determines the relation among three angles and a side of the spherical
triangle.

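The cosine law was derived by considering products of vectors in pairs, so
we may expect to find a different “law” by considering the product $\hat{a}\wedge\hat{b}\wedge\hat{c}$.
Inserting $\hat{b}\wedge\hat{c} = \hat{A}\sin A$ from (A.5b) and the corresponding relations from
(A.5c, a) into $\hat{a}\wedge\hat{b}\wedge\hat{c}$, we get

$$\hat{a}\wedge\hat{b}\wedge\hat{c} = \hat{a}\wedge\hat{A}\sin A = \hat{b}\wedge\hat{B}\sin B = \hat{c}\wedge\hat{C}\sin C. \tag{A.15}$$

We can find an analogous relation from the product $\hat{A}\hat{B}\hat{C}$. Using (A.7b), we
ascertain that

$$\langle\hat{A}\hat{B}\hat{C}\rangle_0 = -\langle\hat{A}\,i\hat{a}\rangle_0\sin a = -i\,\hat{a}\wedge\hat{A}\sin a.$$

Obtaining the corresponding relations from (A.7c, a), we get

$$i\langle\hat{A}\hat{B}\hat{C}\rangle_0 = \hat{a}\wedge\hat{A}\sin a = \hat{b}\wedge\hat{B}\sin b = \hat{c}\wedge\hat{C}\sin c. \tag{A.16}$$

The ratio of (A.15) to (A.16) gives us

$$\frac{\hat{a}\wedge\hat{b}\wedge\hat{c}}{i\langle\hat{A}\hat{B}\hat{C}\rangle_0} = \frac{\sin A}{\sin a} = \frac{\sin B}{\sin b} = \frac{\sin C}{\sin c}. \tag{A.17}$$

This is called the sine law in spherical trigonometry. Obviously, it relates any
two sides of a spherical triangle to the two opposing angles.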
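
The cosine law for angles, cos c = -cos a cos b + sin a sin b cos C, and the sine law, sin A / sin a = sin B / sin b = sin C / sin c, can be checked numerically in the same spirit. The Python sketch below is a hedged sanity check with hypothetical helper names and arbitrary sample vertices. It again assumes interior angles measured between arc tangents, so all angles lie between 0 and π.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def unit(u):
    n = math.sqrt(dot(u, u))
    return tuple(x / n for x in u)


def angle(u, v):
    # Angle between two unit vectors, clamped against rounding error.
    return math.acos(max(-1.0, min(1.0, dot(u, v))))


def vertex_angle(p, q, r):
    # Interior angle at vertex p between the arcs p->q and p->r.
    tq = unit(tuple(qk - dot(q, p) * pk for qk, pk in zip(q, p)))
    tr = unit(tuple(rk - dot(r, p) * pk for rk, pk in zip(r, p)))
    return angle(tq, tr)


va = unit((1.0, 0.2, 0.1))
vb = unit((0.1, 1.0, 0.3))
vc = unit((0.2, 0.1, 1.0))

# Sides (capital letters) and the interior angles opposite to them.
A, B, C = angle(vb, vc), angle(vc, va), angle(va, vb)
a = vertex_angle(va, vb, vc)
b = vertex_angle(vb, vc, va)
c = vertex_angle(vc, va, vb)

# Residual of the cosine law for angles.
cosine_residual = math.cos(c) - (
    -math.cos(a) * math.cos(b) + math.sin(a) * math.sin(b) * math.cos(C)
)

# The three ratios of the sine law.
ratios = [math.sin(A) / math.sin(a),
          math.sin(B) / math.sin(b),
          math.sin(C) / math.sin(c)]

print(cosine_residual, ratios)
```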

We get further information about the spherical triangle by considering the
product of each vector with the bivector of the opposing side. Each product
has the form of Equation (A.1) which corresponds to Figure A.1. Thus, we
have the equations

$$\hat{a}\hat{A} = \hat{\alpha}\,e^{i\boldsymbol{\alpha}} = \hat{\alpha}\cos\alpha + i\sin\alpha, \tag{A.18a}$$
$$\hat{b}\hat{B} = \hat{\beta}\,e^{i\boldsymbol{\beta}} = \hat{\beta}\cos\beta + i\sin\beta, \tag{A.18b}$$
$$\hat{c}\hat{C} = \hat{\gamma}\,e^{i\boldsymbol{\gamma}} = \hat{\gamma}\cos\gamma + i\sin\gamma. \tag{A.18c}$$

The angles $i\boldsymbol{\alpha}$, $i\boldsymbol{\beta}$, $i\boldsymbol{\gamma}$ are “altitudes” of the spherical triangle with lengths $\alpha$, $\beta$,
$\gamma$, as shown by Figure A.4. If the trivector parts of (A.18a, b, c) are
substituted into (A.15) and (A.16) we get the corresponding equations of
traditional spherical trigonometry:

$$\frac{\hat{a}\wedge\hat{b}\wedge\hat{c}}{i} = \sin\alpha\sin A = \sin\beta\sin B = \sin\gamma\sin C, \tag{A.19}$$

$$\langle\hat{A}\hat{B}\hat{C}\rangle_0 = \sin\alpha\sin a = \sin\beta\sin b = \sin\gamma\sin c. \tag{A.20}$$

This completes our algebraic analysis of basic relations in spherical
trigonometry.


Exercises 


(A.1) Prove that Equation (A.8) is equivalent to the equations

      $$e^{\mathbf{A}}e^{\mathbf{B}}e^{\mathbf{C}} = 1 \quad\text{and}\quad e^{\mathbf{B}}e^{\mathbf{C}}e^{\mathbf{A}} = 1.$$

(A.2) The spherical triangle $(\hat{a}, \hat{b}, \hat{c})$ satisfying Equations (A.5) and (A.7)
      determines another spherical triangle $(\hat{a}', \hat{b}', \hat{c}')$ by the duality
      relations

      $$\hat{A} = i\hat{a}', \quad \hat{B} = i\hat{b}', \quad \hat{C} = i\hat{c}'.$$

      The triangle $(\hat{a}', \hat{b}', \hat{c}')$ is called the polar triangle of the triangle
      $(\hat{a}, \hat{b}, \hat{c})$, because its sides are arcs of great circles with $\hat{a}$, $\hat{b}$, and $\hat{c}$ as
      poles.

      Prove that the sides $A'$, $B'$, $C'$ of the polar triangle are equal to
      the exterior angles supplementary to the interior angles $\alpha$, $\beta$, $\gamma$ of
      the primary triangle, and, conversely, that the sides $A$, $B$, $C$ of the
      primary triangle are equal to the exterior angles supplementary to
      the interior angles $\alpha'$, $\beta'$, $\gamma'$ of the polar triangle.

      From Equations (A.18a, b, c) prove that corresponding altitudes
      of the two triangles lie on the same great circle and that the distance
      along the great circle between $\hat{a}$ and $\hat{a}'$ is $|\alpha - \tfrac{1}{2}\pi|$.

(A.3) For the right spherical triangle with $c = \pi/2$, prove that

      $$\sin a = \frac{\sin A}{\sin C}, \qquad \sin b = \frac{\sin B}{\sin C},$$
      $$\cos A = \frac{\cos a}{\sin b}, \qquad \cos B = \frac{\cos b}{\sin a},$$
      $$\cos C = \cos A\cos B = \cot a\cot b.$$

(A.4) Prove that an equilateral spherical triangle is equiangular with angle
      $a$ related to side $A$ by

      $$\cos a - \cos A + \cos a\cos A = 0.$$

(A.5) Assuming (A.5a, b, c), prove that

      $$2(\hat{a}\wedge\hat{b}\wedge\hat{c})\,\hat{c} = e^{\mathbf{B}}e^{\mathbf{A}} - e^{\mathbf{A}}e^{\mathbf{B}} = 2i\hat{c}\sin c\sin A\sin B.$$

      Hence,

      $$\frac{\hat{a}\wedge\hat{b}\wedge\hat{c}}{i} = \sin A\sin B\sin c = \sin A\sin b\sin C = \sin a\sin B\sin C.$$

      Use this to derive the sine law, and, from (A.19), expressions for
      the altitudes of a spherical triangle.

(A.6) Prove that

      $$|\langle\hat{a}\hat{b}\hat{c}\rangle_1|^2 = (\hat{a}\cdot\hat{b})^2 + (\hat{b}\cdot\hat{c})^2 + (\hat{c}\cdot\hat{a})^2 - 2\,\hat{a}\cdot\hat{b}\;\hat{b}\cdot\hat{c}\;\hat{c}\cdot\hat{a}.$$

      Assuming (A.5a, b, c) use the identity

      $$\hat{a}\hat{b}\hat{c} = \hat{a}\wedge\hat{b}\wedge\hat{c} + \langle\hat{a}\hat{b}\hat{c}\rangle_1$$

      to prove that

      $$|\hat{a}\wedge\hat{b}\wedge\hat{c}|^2 = 1 - \cos^2 A - \cos^2 B - \cos^2 C + 2\cos A\cos B\cos C.$$

      Note that this can be used with (A.19) to find the altitudes of a
      spherical triangle from the sides.

(A.7) Find the surface area and volume of a parallelopiped with edges of
      lengths $a$, $b$, $c$ and face angles $A$, $B$, $C < \tfrac{1}{2}\pi$.

(A.8) Establish the identity

      $$(\mathbf{a}\wedge\mathbf{b})\cdot(\mathbf{c}\wedge\mathbf{d}) + (\mathbf{b}\wedge\mathbf{c})\cdot(\mathbf{a}\wedge\mathbf{d}) + (\mathbf{c}\wedge\mathbf{a})\cdot(\mathbf{b}\wedge\mathbf{d}) = 0.$$

      Note that the three terms differ only by a cyclic permutation of the
      first three vectors. Use this identity to prove that the altitudes of a
      spherical triangle intersect in a point (compare with Exercise
      2-4.11a).

(A.9) Prove that on a unit sphere the area $\Delta$ of a spherical triangle with
      interior angles $a$, $b$, $c$ (Figure A.3) is given by the formula

      $$\Delta = a + b + c - \pi.$$

      Since $\Delta$ is given by the difference between the sum of interior
      angles for a spherical triangle and a plane triangle, it is often
      called the spherical excess.

      Hint: The triangle is determined by the intersection of great circles
      which divide the sphere into several regions with area $\Delta$, $\Delta_a$,
      $\Delta_b$ or $\Delta_c$ as shown in Figure A.5. What relation exists between the
      angle $a$ and the area $\Delta + \Delta_a$?

      Fig. A.5. Spherical excess.


Appendix B 


Elliptic Functions 


Elliptic functions provide general solutions of differential equations with the
form

$$\left(\frac{dy}{dx}\right)^2 = f(y), \tag{B.1}$$

where $f(y)$ is a polynomial in $y = y(x)$. Such equations are very common in
physics, arising frequently from energy integrals where the left side of the
equation comes from a kinetic energy term.

Since different polynomials can be related by such devices as factoring and
change of variables, it turns out that the general problem of solving (B.1) for
a large class of polynomials can be reduced to solving a differential equation
of the standard form

$$\left(\frac{dy}{dx}\right)^2 = (1 - y^2)(1 - k^2y^2), \tag{B.2}$$

for $0 \le k \le 1$ and $-1 \le y \le 1$. The solution of this equation for the
conditions

$$y = 0, \quad \frac{dy}{dx} > 0 \quad \text{when } x = 0, \tag{B.3}$$

is denoted by

$$y = \operatorname{sn} x \tag{B.4}$$

(pronounced “ess-en-ex”). Of course, this function depends on the value of
the parameter $k$, which is called the modulus.
Direct integration of (B.2) produces the inverse function

$$x = \int_0^y \frac{dy}{[(1 - y^2)(1 - k^2y^2)]^{1/2}}. \tag{B.5}$$

It is an odd function of $y$, which increases steadily from 0 to

$$K \equiv \int_0^1 \frac{dy}{[(1 - y^2)(1 - k^2y^2)]^{1/2}} \tag{B.6}$$

as $y$ increases from 0 to 1. Consequently, $y = \operatorname{sn} x$ is an odd function of $x$, and
it has period $4K$; that is

$$\operatorname{sn}(x + 4K) = \operatorname{sn} x. \tag{B.7}$$

The integral (B.6) is called a complete elliptic integral of the first kind. We can
evaluate it by a change of variables and a series expansion:

$$K(k) = \int_0^{\pi/2} \frac{du}{(1 - k^2\sin^2 u)^{1/2}} = \frac{\pi}{2}\left\{1 + \left(\frac{1}{2}\right)^2 k^2 + \left(\frac{1\cdot 3}{2\cdot 4}\right)^2 k^4 + \cdots + \left[\frac{1\cdot 3\cdots(2n-1)}{2\cdot 4\cdots 2n}\right]^2 k^{2n} + \cdots\right\}. \tag{B.8}$$

The function $K(k)$ is graphed in Figure B.1.
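
As a minimal sketch, the power series for K(k), namely (π/2) times the sum over n of [(2n-1)!!/(2n)!!]² k^(2n), can be evaluated directly and compared with an independent method. The comparison used here is the arithmetic-geometric mean identity K(k) = π / (2 AGM(1, √(1-k²))), a standard result that is not stated in the text and is brought in only as a cross-check. The function names and the number of series terms are arbitrary choices.

```python
import math


def K_series(k, terms=200):
    """Complete elliptic integral of the first kind from its power series in k."""
    total = 0.0
    coeff = 1.0  # (2n-1)!!/(2n)!! for n = 0
    for n in range(terms):
        if n > 0:
            coeff *= (2 * n - 1) / (2 * n)
        total += coeff ** 2 * k ** (2 * n)
    return 0.5 * math.pi * total


def K_agm(k):
    """Cross-check via the arithmetic-geometric mean (valid for 0 <= k < 1)."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    while abs(a - b) > 1e-15 * a:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


for k in (0.0, 0.5, 0.8):
    print(k, K_series(k), K_agm(k))
```

The series converges slowly as k approaches 1, where K grows without bound, so the fixed number of terms is adequate only for moderate values of the modulus.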
Two other functions $\operatorname{cn} x$ and $\operatorname{dn} x$ can be defined by the equations

$$\operatorname{cn}^2 x = 1 - \operatorname{sn}^2 x, \qquad \operatorname{cn} 0 = 1,$$
$$\operatorname{dn}^2 x = 1 - k^2\operatorname{sn}^2 x, \qquad \operatorname{dn} 0 = 1, \tag{B.9}$$

along with the condition that their derivatives be continuous to determine the
sign of the square root. Since $k < 1$, $\operatorname{dn} x$ is always positive with period $2K$,
while $\operatorname{cn} x$ has period $4K$.

Fig. B.1. Graph of $K$ as a function of the modulus $k$.

The three functions $\operatorname{sn} x$, $\operatorname{cn} x$ and $\operatorname{dn} x$ are called Jacobian elliptic functions, or
just elliptic functions. They may be regarded as generalizations or distortions of
the familiar trigonometric functions. Indeed, from the above relations it is
readily verified that for $k = 0$,

$$\operatorname{sn} x = \sin x, \quad \operatorname{cn} x = \cos x, \quad K = \tfrac{1}{2}\pi, \tag{B.10}$$

and for $k = 1$,

$$\operatorname{sn} x = \tanh x, \quad \operatorname{cn} x = \operatorname{dn} x = \operatorname{sech} x, \quad K = \infty. \tag{B.11}$$

Traditionally, the nomenclature of elliptic functions is used only when $k$ is in
the range $0 < k < 1$.

Graphs of the elliptic functions are shown in Figure B.2. Tables of elliptic
functions can be found in standard references such as Jahnke and Emde
(1945), but programs to evaluate elliptic functions on a computer are not
difficult to write, and some are available commercially.
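
One simple way to write such a program, offered here only as a sketch, is to integrate the standard derivative relations d(sn)/dx = cn dn, d(cn)/dx = -sn dn, d(dn)/dx = -k² sn cn from the initial values sn 0 = 0, cn 0 = dn 0 = 1. The fourth-order Runge-Kutta scheme, the function name `jacobi` and the step count are choices made for this illustration. Production codes typically use faster methods.

```python
import math


def jacobi(x, k, steps=2000):
    """Return (sn x, cn x, dn x) for modulus k by RK4 integration from 0 to x."""

    def rhs(s, c, d):
        # Derivatives of sn, cn, dn with respect to x.
        return c * d, -s * d, -k * k * s * c

    s, c, d = 0.0, 1.0, 1.0  # sn 0 = 0, cn 0 = dn 0 = 1
    h = x / steps
    for _ in range(steps):
        k1 = rhs(s, c, d)
        k2 = rhs(s + 0.5 * h * k1[0], c + 0.5 * h * k1[1], d + 0.5 * h * k1[2])
        k3 = rhs(s + 0.5 * h * k2[0], c + 0.5 * h * k2[1], d + 0.5 * h * k2[2])
        k4 = rhs(s + h * k3[0], c + h * k3[1], d + h * k3[2])
        s += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        c += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        d += h * (k1[2] + 2.0 * k2[2] + 2.0 * k3[2] + k4[2]) / 6.0
    return s, c, d


# Limiting cases: k = 0 gives circular functions, k = 1 gives hyperbolic ones.
print(jacobi(1.0, 0.0), (math.sin(1.0), math.cos(1.0), 1.0))
print(jacobi(1.0, 1.0), (math.tanh(1.0), 1.0 / math.cosh(1.0)))
```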


Fig. B.2. Graphs of the elliptic functions $\operatorname{sn} x$, $\operatorname{cn} x$, $\operatorname{dn} x$ for $k^2 = 0.7$.

For applications we need some systematic procedures for reducing
equations to the standard form (B.2). Consider the equation

$$\left(\frac{dr}{dx}\right)^2 = Ar^4 + Br^2 + C + Dr^{-2}, \tag{B.12}$$

where $A$, $B$, $C$ and $D$ are given scalar constants. This can be reduced to
standard form by the change of variables

$$r^2 = ay^2 + b, \quad \text{where } y = \operatorname{sn}(\mu x) \tag{B.13}$$

and $a$, $b$, $\mu$ are constants. To perform the reduction and determine the
constants, we differentiate (B.13) to get

$$r^2\left(\frac{dr}{dx}\right)^2 = a^2y^2\left(\frac{dy}{dx}\right)^2. \tag{B.14}$$

The left side of this equation can be expressed in terms of $y^2$ by substituting
(B.12) and (B.13), while the right side can be expressed in terms of $y^2$ by
using

$$\left(\frac{dy}{dx}\right)^2 = \mu^2(1 - y^2)(1 - k^2y^2). \tag{B.15}$$

Then, by equating coefficients of like powers in $y$, we obtain

$$Ab^3 + Bb^2 + Cb + D = 0,$$
$$3b^2A + 2bB + C = a\mu^2,$$
$$3bA + B = -\mu^2(1 + k^2),$$
$$aA = \mu^2k^2.$$

When these four equations are solved for the four unknowns $b$, $a$, $\mu$ and $k^2$,
the solution to (B.12) is given explicitly by (B.13). To prepare for this, we
eliminate $k^2$ from the third equation and $\mu$ from the second, putting the
equations in the form

$$Ab^3 + Bb^2 + bC + D = 0, \tag{B.16a}$$
$$Aa^2 + (3bA + B)a + (3b^2A + 2bB + C) = 0, \tag{B.16b}$$
$$\mu^2 = -A(3b + a) - B > 0, \tag{B.16c}$$
$$k^2 = \frac{aA}{\mu^2}. \tag{B.16d}$$

After the cubic equation (B.16a) has been solved for $b$, the quadratic
equation (B.16b) can be solved for $a$. Then the values of $a$ and $b$ can be used
to evaluate $\mu^2$ and $k^2$.
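
The reduction procedure can be sketched in Python as follows: solve the cubic Ab³ + Bb² + Cb + D = 0 for b, solve the quadratic Aa² + (3bA + B)a + (3b²A + 2bB + C) = 0 for a, then evaluate μ² = -A(3b + a) - B and k² = aA/μ². The function names, the grid-plus-bisection root finder (which misses roots of even multiplicity), and the sample coefficients are assumptions of this illustration. The sketch also assumes A ≠ 0 and keeps only candidates with μ² > 0 and 0 < k² < 1.

```python
import math


def real_roots(f, lo, hi, samples=4000, tol=1e-12):
    """Real roots of f on [lo, hi] located by sign changes and bisection."""
    roots = []
    step = (hi - lo) / samples
    x0, f0 = lo, f(lo)
    for i in range(1, samples + 1):
        x1 = lo + i * step
        f1 = f(x1)
        if f0 == 0.0:
            roots.append(x0)
        elif f0 * f1 < 0.0:
            a, b, fa = x0, x1, f0
            while b - a > tol:
                m = 0.5 * (a + b)
                fm = f(m)
                if fa * fm <= 0.0:
                    b = m
                else:
                    a, fa = m, fm
            roots.append(0.5 * (a + b))
        x0, f0 = x1, f1
    if f0 == 0.0:
        roots.append(x0)
    return roots


def reduce_to_standard_form(A, B, C, D):
    """Return all (b, a, mu2, k2) with mu2 > 0 and 0 < k2 < 1."""
    cubic = lambda b: ((A * b + B) * b + C) * b + D
    bound = 1.0 + max(abs(B), abs(C), abs(D)) / abs(A)  # Cauchy bound on roots
    solutions = []
    for b in real_roots(cubic, -bound, bound):
        q1 = 3.0 * b * A + B
        q0 = 3.0 * b * b * A + 2.0 * b * B + C
        disc = q1 * q1 - 4.0 * A * q0
        if disc < 0.0:
            continue  # no real value of a for this root b
        for sign in (1.0, -1.0):
            a = (-q1 + sign * math.sqrt(disc)) / (2.0 * A)
            mu2 = -A * (3.0 * b + a) - B
            if mu2 <= 0.0:
                continue
            k2 = a * A / mu2
            if 0.0 < k2 < 1.0:
                solutions.append((b, a, mu2, k2))
    return solutions


# Sample coefficients constructed so that b = 1, a = 1, mu^2 = 1, k^2 = 1/2.
print(reduce_to_standard_form(0.5, -3.0, 5.5, -3.0))
```

With these sample coefficients the cubic has roots b = 1, 2, 3, and only b = 1 with a = 1 passes both conditions, which shows why the inequality in the third equation matters when selecting among roots.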

The theory of elliptic functions is rich and complex, a powerful tool for
mathematical physics. We have discussed only some simpler aspects of the theory
needed for applications in the text.


Exercises 
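(B.1) Establish the derivatives

      $$\frac{d}{dx}\operatorname{sn} x = \operatorname{cn} x\operatorname{dn} x,$$
      $$\frac{d}{dx}\operatorname{cn} x = -\operatorname{sn} x\operatorname{dn} x,$$
      $$\frac{d}{dx}\operatorname{dn} x = -k^2\operatorname{sn} x\operatorname{cn} x.$$

(B.2) Show that $y = \operatorname{cn}(\mu t)$ and $r = \operatorname{dn}(\mu t)$ are solutions of the differential
      equations

      $$\dot{y}^2 = \mu^2(1 - y^2)(k'^2 + k^2y^2),$$
      $$\dot{r}^2 = \mu^2(1 - r^2)(r^2 - k'^2),$$

      where $k'^2 = 1 - k^2$.

(B.3) Find a change of variables that transforms

      $$\left(\frac{d\omega}{dx}\right)^2 = A\omega^4 + B\omega^3 + C\omega^2 + D\omega$$

      into an equation of the form (B.12).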


Appendix C 


Tables of Units, Constants and Data 


C-1. Units and Conversion Factors

Length
  1 kilometer (km)           = $10^{3}$ meter (m)
  1 angstrom (Å)             = $10^{-10}$ m
  1 fermi                    = $10^{-15}$ m
  1 light-year               = $9.460 \times 10^{15}$ m
  1 astronomical unit (AU)   = $1.49596 \times 10^{11}$ m

Time, frequency
  1 sidereal day             = 24 × 60 × 60 sidereal seconds (s)
                             = 0.997 269 57 mean solar days (d)
                             = 23h 56m 4.091s mean solar time
  1 mean solar day           = 1.002 737 91 sidereal days
  1 sidereal year            = 365.256 36 d
  1 sidereal month           = 27.321 661 0 d
  1 Hertz                    = 1 vibration (or cycle) per second

Force, Energy, Power
  1 newton (N)               = 1 kg·m/s² = $10^{5}$ dynes
  1 joule (J)                = 1 N·m = $10^{7}$ ergs
                             = $6.242 \times 10^{18}$ electron-volts (eV)
  1 MeV                      = $10^{6}$ eV
  1 watt                     = 1 J/s

Magnetic Field
  1 tesla                    = 1 Weber/m² = $10^{4}$ gauss


C-2. Physical Constants

  Gravitational constant     $G = 6.668 \times 10^{-11}$ N m²/kg²
  Speed of light             $c = 2.99791 \times 10^{8}$ m/s
  Electron mass              $m_e = 9.1096 \times 10^{-31}$ kg = 0.511 MeV/c²
  Proton mass                $m_p = 1.6725 \times 10^{-27}$ kg = 938.3 MeV/c²
  Neutron mass               $m_n = 1.6748 \times 10^{-27}$ kg = 939.6 MeV/c²
  Electron charge            $e = 1.602 \times 10^{-19}$ Coulombs


C-3. The Earth

  Mass                       $M_E = 5.976(4) \times 10^{24}$ kg
  Equatorial radius          $a = 6.37816(4) \times 10^{6}$ m
  Polar radius               $c = 6.35677(9) \times 10^{6}$ m
  Flattening                 $(a - c)/a = 1/298.25 = 3.3529 \times 10^{-3}$
  Principal moments of inertia
    Polar                    $I_3 = 0.3306\, M_E a^2$
    Equatorial               $I_1 = 0.3295\, M_E a^2$
                             $(I_3 - I_1)/I_3 = 1/305.3 = 3.276 \times 10^{-3}$
  Inclination of equator     = 23° 27′
  Length of year (Julian)    = 365.25 days


C-4. The Sun

  Solar mass                 $M_\odot = 1.989(2) \times 10^{30}$ kg
  Solar radius               $R_\odot = 6.9598(7) \times 10^{8}$ m
  Solar luminosity           $L_\odot = 3.90(4) \times 10^{26}$ J/s
  Mean Earth-Sun distance    = 1 AU = $1.49596 \times 10^{11}$ m


C-5. The Moon

  Lunar mass                 $M_M = M_E/81.301 = 7.350 \times 10^{22}$ kg
  Lunar radius               $R_M = 1.738 \times 10^{6}$ m
  Inclination of lunar equator
    to ecliptic              = [illegible in source]
    to orbit                 = [illegible in source]
  Mean Earth-Moon distance   = $3.84 \times 10^{8}$ m
  Eccentricity of orbit      = 0.05490
  Sidereal period            = 27.321 661 d


C-6. The Planets

  Planet    Mass         Sidereal     Semi-major axis       Eccentricity  Inclination
            (Earth = 1)  period (d)   (AU)      (10^6 km)   of orbit      to ecliptic (deg)
  Mercury   0.0554       87.97        0.38710   57.9        0.2056        7.00
  Venus     0.815        224.70       0.72333   108.2       0.0068        3.39
  Earth     1.000        365.256      1.00000   149.6       0.0167        -
  Mars      0.1075       686.98       1.52369   227.9       0.0934        1.85
  Jupiter   317.83       4332.59      5.20280   778.3       0.0485        1.31
  Saturn    95.147       10759.22     9.53884   1427.0      0.0557        2.49
  Uranus    14.54        30685.4      19.1819   2869.6      0.0472        0.77
  Neptune   [illegible]  60189.0      30.0578   4496.6      0.0086        1.77
  Pluto     0.17         90465.0      39.44     5900.0      0.250         17.16

Hints and Solutions for Selected Exercises


“An expert is someone who has made all the mistakes.”
H. Bethe
“Therefore we should strive to make mistakes as fast as
possible.”
J. Wheeler
Section 1-7. 
(7.1c) $A + B = A + C$                               (Given)
       $(-A) + (A + B) = (-A) + (A + C)$             (Addition Property)
       $((-A) + A) + B = ((-A) + A) + C$             (Associativity)
       $0 + B = 0 + C$                               (Additive Inverse)
       $B = C$                                       (Additive Identity)
(7.1d) $AB = AC$, $A^{-1}(AB) = A^{-1}(AC)$, $(A^{-1}A)B = (A^{-1}A)C$, $1B = 1C$,
       $B = C$.
(72a) (@-a)* = =p. >. sea . Bee 
Oe ata a-a aoa 
This is undefined if a = a’. 
G26) lia = a>, then Z : a= +(1 + 4) is idempotent. 
(7.2c) IfR = AN,then AR = R. Soif RR" = 1,thenARR* = RR'=A 
which is a contradiction. 
(72d) WP? =Pand PP *=1, then PP'= PP'= P. Hence P = 1, 
Section 2-1. 
(1.8 (anb):(cad) = a-(b-(cad)) = a-(b-c d—c b-d) = b-c ad —- ae b-d 
= (anb cad), = <abcad), — <beaba), 
= b-[(ead)-a] = [b-(cad)]-a. 
Note that the ambiguity in writing b-(cwd)-a is inconsequential. 
(1.2) If $\alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c} = 0$ and $\alpha \neq 0$, then $\mathbf{b}\wedge\mathbf{c}\wedge(\alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c}) =$
(1.3)


(1.4) 


(1.5) 
(1.6) 


(1.7) 


(1.8) 


en} 
(10s 


(1.13) 


(1.14) 


(1.15) 


(1.19) 


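$\alpha\,\mathbf{b}\wedge\mathbf{c}\wedge\mathbf{a} = 0$, so $\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c} = 0$.
If $\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c} = 0$ and $\mathbf{a}\wedge\mathbf{b} \neq 0$, then, from Exercise (1.1),
$(\mathbf{a}\wedge\mathbf{b})^2\mathbf{c} - (\mathbf{a}\wedge\mathbf{b})\cdot(\mathbf{a}\wedge\mathbf{c})\,\mathbf{b} + (\mathbf{a}\wedge\mathbf{b})\cdot(\mathbf{b}\wedge\mathbf{c})\,\mathbf{a} = 0.$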


a cba 


<——— ——— - 3 ; 
a ala+ab) provided the denominator does not vanish. 


First show that axaAB = aaB. Then use xB = x-B + xaB to get 


oa + aB-a— aaBB 


= (a + a'aaB) (a + B)' = 
x = (a + a'anB) (a ) ata? +] Bf) 


fat o¢B 
ca , 


ab’c = b*(a-e + aac) = (ab + anb) (b-c + bac). 
Separate scalar and bivector parts. 


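$(\mathbf{a}\wedge\mathbf{b}\wedge\mathbf{c})\cdot(\mathbf{u}\wedge\mathbf{v}\wedge\mathbf{w}) = (\mathbf{a}\wedge\mathbf{b})\cdot[\mathbf{c}\cdot(\mathbf{u}\wedge\mathbf{v}\wedge\mathbf{w})]$
$= \mathbf{c}\cdot\mathbf{u}\,(\mathbf{a}\wedge\mathbf{b})\cdot(\mathbf{v}\wedge\mathbf{w}) - \mathbf{c}\cdot\mathbf{v}\,(\mathbf{a}\wedge\mathbf{b})\cdot(\mathbf{u}\wedge\mathbf{w}) + \mathbf{c}\cdot\mathbf{w}\,(\mathbf{a}\wedge\mathbf{b})\cdot(\mathbf{u}\wedge\mathbf{v}).$
$\langle\mathbf{a}\mathbf{b}\mathbf{u}\mathbf{v}\rangle_0 = \mathbf{a}\cdot\mathbf{b}\;\mathbf{u}\cdot\mathbf{v} + (\mathbf{a}\wedge\mathbf{b})\cdot(\mathbf{u}\wedge\mathbf{v}).$
$\langle\mathbf{a}\mathbf{b}\mathbf{c}\mathbf{u}\mathbf{v}\mathbf{w}\rangle_0 = 2\,\mathbf{a}\cdot\mathbf{b}\,\langle\mathbf{c}\mathbf{u}\mathbf{v}\mathbf{w}\rangle_0 - \langle\mathbf{b}\mathbf{a}\mathbf{c}\mathbf{u}\mathbf{v}\mathbf{w}\rangle_0.$
Repeat the operation until $\mathbf{a}$ has been moved to the far right within
the bracket. Then use $\langle\mathbf{a}\mathbf{b}\mathbf{c}\mathbf{u}\mathbf{v}\mathbf{w}\rangle_0 = \langle\mathbf{b}\mathbf{c}\mathbf{u}\mathbf{v}\mathbf{w}\mathbf{a}\rangle_0$, which follows from
Exercise (2.4).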


One proof uses equations (1.8) and (1.14). 


a, ale 
a,a,:--a,)' = a;'---aj'a)) = + 
( 12 Bs) r 2 1 a?---asae 
(AD! = a,--a,a, = —a,---a,a,a, = (-1)’a,:--a,a,a,a, 


II 


Sa (aly 'a,a,"*'a, = (-1)'' (le "a,a,a,*"'a, 
- (-1)°?""a,a,--a,. 


Use (1.24a, b) and Exercise (1.13) as follows, 

(AB), = (B'A'), = X(BNKA'),)y = 2XKB), (A))o = (BADo 

If aabacad = 0, then there exist scalars a, 8, ysuch that d = aa + Pb 
+ ye. Hence aab + cad = anb + acaa + Peab = (a + Be)a(b - 
ac) = uavy. If aab + cad = uav, then 

(aab + cad)a(aab + cad) = 2aabacad = uavauavy = 0. 

AB >A) -CBD ANB DA) ACB).: 

Ti@2) and (2.2) were allowed to apply to the case r = lands = 0, 
we would have a:A = <ad),, = Aaand aad = (ad), ,, = Aa, which is 
inconsistent with aA = a-A + aad. 
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Section 2-3. 


(3.1a) 


(3.1b) 


(3.2) 


(3.3) 


(3.4) 


(4) 


G10) 


By (2.5a), $x \wedge B = \tfrac{1}{2}(xB + Bx) = 0$,
or by (2.6b), $xB = x\cdot B = -B\cdot x = -Bx$.
$x' = xB = x\cdot B = \langle xB\rangle_1$, a vector.
$|x'|^2 = x'^2 = xBxB = x(-xB)B = -x^2B^2 = |x|^2|B|^2$,
$xx' = x\cdot x' + x \wedge x' = x^2B$

implies $x\cdot x' = 0$,

and also $x \wedge x' = x^2B$.
a —iaab = a,6, aab = (6,A4¢,):(aab) 

= ¢,'a0,-b — o,bo,-a 
i(ab) = iab = a(ib), 
i(a-b + ia X b) = a-(ib) + aa(ib). 
Equate vector and trivector parts separately to get the first two 
identities. 
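$a\, b \wedge c = a\cdot(b \wedge c) + a \wedge (b \wedge c) = a\, i(b \times c)$

$= i[a\cdot(b \times c) + a \wedge (b \times c)]$

$= i\, a\cdot(b \times c) - a \times (b \times c)$.
Equate vector and trivector parts separately.
$abc - cba = a(b \wedge c) + (b \wedge c)a = 2\, a \wedge (b \wedge c)$.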
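$(a \wedge b)\cdot(c \wedge d) = \langle a \wedge b\, c \wedge d\rangle_0 = \langle i\, a \times b\; i\, c \times d\rangle_0$

$= -\langle a \times b\; c \times d\rangle_0 = -(a \times b)\cdot(c \times d)$.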
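
The identity above can be spot-checked numerically. The following is a minimal sketch in plain Python, assuming the expansion $(a \wedge b)\cdot(c \wedge d) = b\cdot c\, a\cdot d - a\cdot c\, b\cdot d$ from the hint for Exercise (1.1); the helper names `dot`, `cross` and `wedge_dot` are illustrative only and do not come from the text.

```python
# Spot check of (a^b).(c^d) = -(a x b).(c x d) for vectors in three dimensions.
# The left side uses the expansion b.c a.d - a.c b.d (assumed from Exercise (1.1)).

def dot(u, v):
    """Euclidean inner product of two 3-tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Conventional right-handed cross product."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def wedge_dot(a, b, c, d):
    """Scalar (a^b).(c^d) written in terms of inner products of vectors."""
    return dot(b, c) * dot(a, d) - dot(a, c) * dot(b, d)

a, b = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)
c, d = (0.0, 1.0, -1.0), (2.0, 2.0, 1.0)

lhs = wedge_dot(a, b, c, d)
rhs = -dot(cross(a, b), cross(c, d))
print(lhs, rhs)
```

Both printed numbers should agree to rounding error for any choice of the four vectors.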

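$a\, b\cdot(c \times d) = a\cdot b\, c \times d - a\cdot c\, b \times d + a\cdot d\, b \times c$
$(u \times v)\, a\cdot(b \times c) = (u \times v)\cdot(a \times b)\, c - (u \times v)\cdot(a \times c)\, b + (u \times v)\cdot(b \times c)\, a$.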
a X b = (anb)i' = (aab)-(6,A6,A6,) 

= (anb)-(¢,a¢0,)6, — (aab):(6,A0,)¢,+ (aarb):(o,A¢0,)o,. 
Make the identifications i = -i,, j = -i,, k = -i,. Hamilton chose a 
lefthanded basis, in contrast to our choice of a righthanded basis. 
By = (6a0;)-(ib) = (6,Agjio,) b, 

= 10,AG6,0,) dD, = -i6,AG,A 6,b;, 
a X b = -lanb = -laabag,o,. 
The last step follows from 
aabao,o, = (anbAag,):a, 
g,°6,anb — a,:bano, + a,:abaa, 
= 3anb — aab + baa = aab. 

Since a-a, =+(ao, + o;,a), 

6,ao, = o,(-¢,a + 2a:0,) 


Il 


= -—3a + 2a = -a. 
a,anbo, = o,ia x bo, = 10,.a x bo, 
= -ia X b. 


6,10, = 16,0, = 3i. 
This problem is the same as Exercise (2.21), 
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a°’a + aa X b+ bab 


ain ala* + b*) 


Section 2-4. 


(4.5b) (a-b) = (a—b)a‘(a-—b) = (1 — ba)(1 —- ab) 
= Ol fd (1 = ei?) == eae _ Ce 
(4.7) The figure is a regular hexagon with external angle $\theta = 2\pi/6 = \pi/3$;
there are two other 
(4.8a) $a^2b^2 = abba = (a\cdot b + a \wedge b)(a\cdot b - a \wedge b) = (a\cdot b)^2 - (a \wedge b)^2$.
For $ab = e^{i\theta}$, this reduces to $\cos^2\theta + \sin^2\theta = 1$.
(4.8b, c) These identities were proved in Exercise 2.9. Note that the second
identity admits the simplification $(a \wedge b)\cdot(b \wedge c) = a \wedge b\; b \wedge c$ if
$a \wedge b \wedge c = 0$.
For $ab = e^{i\theta}$ and $bc = e^{i\phi}$, the identities reduce to
$\cos\phi \sin\theta - \sin(\theta + \phi) + \cos\theta \sin\phi = 0$,
$-\sin\theta \sin\phi = \cos(\theta + \phi) - \cos\theta \cos\phi$.


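(4.8d, e) Note that $bcb = \langle bcb\rangle_1$ because $b \wedge c \wedge b = 0$. Thus,
$a\cdot(bcb) = \langle abcb\rangle_0 = a\cdot b\, c\cdot b + (a \wedge b)\cdot(c \wedge b)$.

The desired identities are obtained by adding and subtracting this
from the identity

$b^2\, a\cdot c = \langle abbc\rangle_0 = a\cdot b\, b\cdot c - (a \wedge b)\cdot(c \wedge b)$.
For $abbc = e^{i(\theta + \phi)}$ and $abcb = e^{i(\theta - \phi)}$, these identities reduce to

$-2 \sin\theta \sin\phi = \cos(\theta + \phi) - \cos(\theta - \phi)$,
$2 \cos\theta \cos\phi = \cos(\theta + \phi) + \cos(\theta - \phi)$.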
(4.9b) Eliminate $a\cdot b$ from $c^2 = a^2 + b^2 + 2a\cdot b$
and $4|A|^2 = -(a \wedge b)^2 = a^2b^2 - (a\cdot b)^2$
$= (ab + a\cdot b)(ab - a\cdot b)$.
(4.10) $(b - a)\cdot(b - c) = (b - a)\cdot(b + a) = b^2 - a^2 = 0$.
(4.11a) Establish and interpret the identity

$(a - b)\cdot(p - c) + (b - c)\cdot(p - a) + (c - a)\cdot(p - b) = 0$.

Note that if any two terms in the identity vanish, the third vanishes
also. Note that this is an instance of logical transitivity, and that the
transitivity breaks the symmetry of the relation.
(4.11b) Establish, interpret and use the identity

$(a - b)\cdot\left(q - \frac{a + b}{2}\right) + (b - c)\cdot\left(q - \frac{b + c}{2}\right) + (c - a)\cdot\left(q - \frac{c + a}{2}\right) = 0$.
Alternatively, one can argue that $(q - a)^2 = (q - b)^2$ and $(q - b)^2 = (q - c)^2$
implies $(q - a)^2 = (q - c)^2$. Here the transitivity in the argument
is quite explicit.

(4.11c) Use the facts that $(a - b)\cdot(p - c) = 0$, and $(a - b)\cdot(2q - a - b) = 0$.
Whence, $(a - b)\cdot(p + 2q - 3r) = 0$.
What more is needed to conclude that $p + 2q - 3r = 0$?
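
The relation $p + 2q - 3r = 0$ can be checked on a concrete triangle. The sketch below is only illustrative: it assumes that $p$ is the point fixed by the altitude conditions of (4.11a), $q$ the point equidistant from the vertices as in (4.11b), and $r = (a + b + c)/3$; the function names are hypothetical.

```python
# Numerical check of p + 2q - 3r = 0 for a plane triangle with vertices a, b, c.
# Assumed readings: p from (a-b).(p-c) = 0 and (b-c).(p-a) = 0,
# q from (a-b).(2q-a-b) = 0 and (b-c).(2q-b-c) = 0, r = (a+b+c)/3.

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def solve2(u, v, k1, k2):
    """Solve u.x = k1 and v.x = k2 for the plane vector x by Cramer's rule."""
    det = u[0] * v[1] - u[1] * v[0]
    return ((k1 * v[1] - u[1] * k2) / det, (u[0] * k2 - k1 * v[0]) / det)

def euler_line_residual(a, b, c):
    """Return the components of p + 2q - 3r for the triangle a, b, c."""
    u, v = sub(a, b), sub(b, c)
    p = solve2(u, v, dot(u, c), dot(v, a))
    q = solve2(u, v, (dot(a, a) - dot(b, b)) / 2, (dot(b, b) - dot(c, c)) / 2)
    r = ((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3)
    return tuple(p[i] + 2 * q[i] - 3 * r[i] for i in range(2))

residual = euler_line_residual((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))
print(residual)
```

The residual should vanish to rounding error for any nondegenerate triangle, which is the content of the question posed in the hint.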


Section 2-5. 


(5.8) $a - b = a(1 - a^{-1}b)$.

$\frac{1}{a - b} = (a - b)^{-1} = (1 - a^{-1}b)^{-1}a^{-1}$.

The quantity $z = a^{-1}b$ has the form of a complex number and, as can
be verified by long division, submits to the binomial expansion

$\frac{1}{1 - z} = 1 + z + z^2 + \cdots$
which converges for $|z|^2 = |b|^2/|a|^2 < 1$.
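
The convergence claim is easy to test numerically. This is a minimal sketch that treats $z$ as an ordinary Python complex number (an assumption made for illustration; in the text $z$ is the product of two vectors); `inverse_by_series` is a hypothetical helper name.

```python
# Partial sums of 1 + z + z^2 + ... compared with 1/(1 - z) for |z| < 1.

def inverse_by_series(z, terms=60):
    """Sum the first `terms` powers of z, starting from z**0."""
    total = 0j
    power = 1 + 0j
    for _ in range(terms):
        total += power
        power *= z
    return total

z = 0.3 + 0.4j                    # |z| = 0.5, inside the region of convergence
approx = inverse_by_series(z)
exact = 1 / (1 - z)
print(abs(approx - exact))
```

With $|z| = 0.5$ the truncation error after 60 terms is far below double-precision rounding, so the printed difference should be negligible.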
Section 2-6. 
(6.1) [(x = a)au]:(¢,A¢,) = AK, a a,) =U (% = a) etc. 


(6.2) (a) Equation (6.2) implies (x — a)u = (x — a)-u = A. 
(6.2) (b) {x} = half line with the direction u and endpoint a. 


(6.3) (a) d = (anb)(b-a)' = (aa =) 
lb — al’ 
(a — c)A(b — a)-(b — a) 
(db). a — =| P,.(a — ¢) 
(6.4) $(x - a) \wedge (b - a) \wedge (c - a) = 0$.
(6.5) The solution set is the line of intersection of the A-plane with the 
B-plane. 
_ (b-a)aB or 
uaB 
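(6.7) $x \wedge u = a \wedge u \Rightarrow x \wedge u \wedge v = a \wedge u \wedge v$,
$y \wedge v = b \wedge v \Rightarrow y \wedge v \wedge u = b \wedge v \wedge u$.
Hence,

$(b - a) \wedge u \wedge v = (y - x) \wedge u \wedge v = d \wedge u \wedge v$.

But $d\cdot(u \wedge v) = 0$, so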
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A 


d = (b-a)auav(uay)! 


[(b — a)Au]-(vau) ey 


—b—a- 
Juay|’ 
& [(a — b)av]-(vau) . 
jJuav|? 


(6.8) d = [(a- b)AU]U" 
(6.10) Comparing Figure 6.2 with Figure 6.4b, we see that we can use 
(6.17) to get 


CAC Bi A CB er, 


A, B,+ B,” 2B, Cl Gs "EG; A, + A, ~ 

(6.11) The area of the quadrilateral 0, a, b, c is divided into four parts by
the diagonals. The theorem can be proved by expressing the division
ratios in terms of these four areas.

(6.12) An immediate consequence of Equation (6.13). 

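(6.13) By Equation (6.12), we may write

$\alpha a + \alpha' a' = \beta b + \beta' b' = \gamma c + \gamma' c' = s$,
$\alpha + \alpha' = \beta + \beta' = \gamma + \gamma' = 1$.

Whence,
$\alpha a - \beta b = -\alpha' a' + \beta' b'$,
$\alpha - \beta = -(\alpha' - \beta')$.
So, if $\alpha - \beta \neq 0$,

$r = \frac{\alpha a - \beta b}{\alpha - \beta} = \frac{\alpha' a' - \beta' b'}{\alpha' - \beta'}$.

If $\alpha - \beta = 0$, the lines are parallel and may be regarded as
intersecting at $\infty$. After deriving similar expressions for p and q we
can show that

$(\beta - \gamma)p + (\gamma - \alpha)q + (\alpha - \beta)r = 0$,

and Exercise (6.12) can be used again.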

(6.14) $U = iu$ implies $(x - a) \wedge U = i(x - a)\cdot u$.

(6.15) $(b - a) \wedge (c - a) \wedge (x - a) = 0$, or, to use the special form of Exercise
(6.14), $(x - a)\cdot(b - a) \times (c - a) = 0$.

(6.16) Expand (aabac)’ = 0. 

(6.17) At the points of intersection $r^2 = (a + \lambda u)^2$.

(6.18) | The equation for the line tangent to the circle at the point d = rd 
can be written x = d + Au = (1 + id/r)d. Its square is x* = r* + A’. 
Evaluate at x = a and solve for d. 
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22h b- i(7/2 — ) 
eos! 
2 sin d 2 sin d 
1-A? ; 2A 
(6.21a) Og a C8, Saat, eS ae ,sin d = 12 F 
(6.22) a, =b,a, = -ae'? — be’? = -(a+ b) cos d + (b-a)-isin d 
(since (b — a)Ai = 0), a, = a, a, = a, = 1, a, = —2 cos @. 


(6.23) (a) Write $z = x + iy = (1 + i\lambda)^{1/2}$. Then $z^2 = x^2 - y^2 + 2xyi = 1 + i\lambda$.
Hence, $x^2 - y^2 = 1$ and $2xy = \lambda$, which describe a
hyperbola in terms of rectangular coordinates.

(b) A circle with radius + and center at +-4,. 

(c) The evolvents of the unit circle, i.e. the path traced by a point 
on a taut string being unwound from around the unit circle. 

(d) A lemniscate. Note that it can be obtained from the hyperbola
in (a) by the inversion $x \to x^{-1}$.
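
Part (a) of (6.23) can be illustrated numerically. The sketch below assumes that the unit bivector $i$ may be represented by the complex unit `1j` and that the principal square root is intended; `hyperbola_point` is a hypothetical helper.

```python
# Points z = (1 + i*lam)**(1/2) should satisfy x^2 - y^2 = 1 and 2xy = lam.
import cmath

def hyperbola_point(lam):
    """Return (x, y) for the principal square root of 1 + i*lam."""
    z = cmath.sqrt(1 + 1j * lam)
    return z.real, z.imag

samples = [-3.0, -0.5, 0.0, 0.5, 2.0]
residuals = []
for lam in samples:
    x, y = hyperbola_point(lam)
    residuals.append((x * x - y * y - 1.0, 2.0 * x * y - lam))
print(residuals)
```

Because the principal root has a positive real part, this parametrization traces only the branch of the hyperbola with $x > 0$.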

(6.24) (a) For |x| # 0,4-X = + a’'; this describes a cone with vertex at the 
origin and vertex angle a given by cos +a = a' < 1; it reduces 
to a line when a = 1 and a plane when a = ~. Only zero is a 
solution when a < 1. 

(b) Interior of a half cone for a* > 1. 

(c) Cone with vertex angle +2; symmetry axis and plane of the 
cone. 

(d) <(ax)’), = (ax)? — janx 
totic to the cone in (c). 

(e) Paraboloid. 

(f) For a°>1, b’ =1 and c # 0; circle if aab = 0, ellipse if 
(a-b)’ >1, parabola if (a:b) = 1, hyperbola if (a:b)? < 1. 


2 


= 1 describes a hyperboloid asymp- 


Section 2-7. 


Ga) <(=)-2-4e- wu-uuu  wub-—utU  u(uAn) 
dt \u a on ue 7 ur 7 ur 

(7.26). Ditterntiates?” hab 

(7.3) Generate the exponential series by a Taylor expansion about t = 0, 
and write F(0) = B. Conversely, differentiate the exponential series 
to get F. 


(7.4) (a) Use the fact that the square of a k-blade is a scalar. 
(b) Consider F(t) = e*‘ where A is a constant bivector. 
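(7.5) Separate $\frac{d}{dt}(rp) = \dot{r}p + r\dot{p}$ into scalar and bivector parts.

$\frac{d}{dt}(p \wedge q \wedge r) = \dot{p} \wedge q \wedge r + p \wedge \frac{d}{dt}(q \wedge r)$

$= \dot{p} \wedge q \wedge r + p \wedge \dot{q} \wedge r + p \wedge q \wedge \dot{r}$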
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(7.7) 


v = uv; d/dt log v’ = 20/v, and according to Equation (7.11), 
dv/dt = v-Q. 


Section 2-8. 


(8.2a) 


(8.2b) 


(8.2£) 


(8.5) 


P= (x—x')-(x-%/) 
a:‘Vr° = a-(x — x’) + (x—x’)-a = 2ar = 2ra-Vr 
a Vr = a V(x — x’) = a-Vx = Hg 
av( =] = i (a°Vr) + ra’V (+ ] 
r li i 


af. era—Ttar. r{rAa) 


—a-r : 
r if iP te 


1 7 
av(—}=av(4]- Le rev (<] 
r Ee ie oe 


; b, oe 
Witea b= —e¢ and2= ax —— ec" . 
a a 


Then z! =x'a= “ei? 
- 
ei”? 
and dz = a' dx = — (dx + xid®@). 
a 
CO 7 da eee. 
oS 
b b 
and | xx =| ae hy oo ee 
a a Xx 0 


This has the scalar part 


{' x-dx I’ dx b 
— —-=log — , 
a >, ¢ a x a 


and the bivector part 


The principal value is obtained by integrating along the straight line
from a to b or along any curve in the plane which can be continuously
deformed into that line without passing through the
origin. If the straight line itself passes through the origin, the
bivector part of the principal part can be assigned either of the
values $\pm i\pi$. If the curve winds about the origin k times, the value
of the integral differs from the principal part by the amount $\pm 2\pi k i$,
with the positive (negative) sign for counterclockwise (clockwise)
winding.
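
The winding contribution can be illustrated by integrating around the unit circle. This is a rough sketch, assuming the plane's unit bivector $i$ is represented by the complex unit `1j` and using a simple midpoint rule; `integral_dz_over_z` is a hypothetical name.

```python
# Midpoint-rule estimate of the integral of dz/z around the unit circle,
# traversed k times (k > 0 counterclockwise, k < 0 clockwise).
import cmath
import math

def integral_dz_over_z(k, steps=20000):
    """Approximate the closed-path integral of dz/z for winding number k."""
    total = 0j
    dtheta = 2.0 * math.pi * k / steps
    for n in range(steps):
        z0 = cmath.exp(1j * dtheta * n)          # start of the small arc
        z1 = cmath.exp(1j * dtheta * (n + 1))    # end of the small arc
        zm = cmath.exp(1j * dtheta * (n + 0.5))  # midpoint of the arc
        total += (z1 - z0) / zm
    return total

ccw_twice = integral_dz_over_z(2)
cw_once = integral_dz_over_z(-1)
print(ccw_twice, cw_once)
```

The two estimates should be close to $4\pi i$ and $-2\pi i$, in line with the sign convention stated above.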


Section 3-2. 


2) $t_\pm = -v_0\cdot g^{-1} \pm \{(v_0\cdot g^{-1})^2 + 2r\cdot g^{-1}\}^{1/2}$. This expression has the drawback
that $r$ and $v_0$ are not independent variables. For given $v_0$ and $r$,
the two roots $t_\pm$ are times of flight to the same point by different
paths. For given $v_0$, they are times of flight to two distinct points on
the same path equidistant from the vertical maximum.
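
The two roots can be examined on a concrete trajectory. The following is a minimal sketch, assuming the constant-force trajectory $r = v_0 t + \tfrac{1}{2} g t^2$ and the inverse $g^{-1} = g/g^2$; the numerical values and the helper names are illustrative assumptions.

```python
# Roots t = -v0.g^{-1} +/- sqrt((v0.g^{-1})^2 + 2 r.g^{-1}) for a given launch velocity.
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def position(v0, g, t):
    """Point reached at time t on the trajectory r = v0*t + g*t^2/2."""
    return tuple(v0[i] * t + 0.5 * g[i] * t * t for i in range(len(v0)))

def flight_times(v0, g, r):
    """Return (t_plus, t_minus); raises ValueError if r cannot be reached."""
    g2 = dot(g, g)
    a = dot(v0, g) / g2        # v0 . g^{-1}
    b = dot(r, g) / g2         # r . g^{-1}
    disc = a * a + 2.0 * b
    if disc < 0:
        raise ValueError("no real time of flight")
    root = math.sqrt(disc)
    return -a + root, -a - root

g = (0.0, -9.8)                  # assumed uniform gravity, m/s^2
v0 = (10.0, 20.0)                # assumed launch velocity, m/s
r = position(v0, g, 1.0)         # a point known to lie on the path
t_plus, t_minus = flight_times(v0, g, r)
print(t_minus, t_plus, position(v0, g, t_plus))
```

For this fixed $v_0$ the smaller root recovers the time used to build $r$, and the larger root gives the point at the same height on the descending side of the path.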

(2.4) gar = xgi = VAY, = VAV, U(Ua—- 2gy)'”: Therefore, horizontal 
range x is a maximum for fixed uv, and y when VAv, = i. 

(2.6) Suggestion: Use the Jacobi identity for g, v, r and the fact that the 
vectors are coplanar. 

ee Ai) => [re dt = es v,Agt. 

12 

Section 3-5. 

(1a) v = 63 m/s = 141 mph. 

(1b) v = 89 m/s = 198 mph. 

(1c) $v = 4.8 \times 10^6$ m/s $= 1 \times 10^7$ mph.

(2) lim 

(3) v = 6.6 m/s = 15 mph = 22 ft/s; 120 m/s. 

(5) The heavy ball beats the light one by 2.2 m and 1/20 sec. 


Section 3-7. 


(7.2) 


C= Zio r= c= rae Ex BY. 


Section 3-8. 


(8.5) 


Z+ wz =0, ; 
% — 2xiw, + w2x = 0, where i = JB, q/Bl 
and the so-called Larmor frequency w, is defined by @, 


general solution 
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r= (ae + ael@)e!" + © cos (aot + d), 


where 
Q=(a+o,)'*?, Ba=0, caB=0. 


The solution can be interpreted as an ellipse with period 27/Q, precessing 
(retrograde) with angular velocity w, while it vibrates along the B direction 
with frequency w,/27 and amplitude | ¢ |. 


Section 4-2. 


47°(60)°R, 
§ 

(2.5) $v = (gR_e)^{1/2} = 7.91$ km/s = 18 000 mph; T = 84.4 min.

40°(R,+h) ey 


(2.4) T? = 


2.6 Tt? = ;h = 3.58 X 10* km. 
gR- 
(2.7) m,/m, = 3.38 X 10° 
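
The figures quoted for (2.5) and (2.6) can be reproduced to within rounding. This sketch assumes $g = 9.8$ m/s$^2$, an Earth radius of $6.37 \times 10^6$ m, a sidereal day of 86164 s, and the usual circular-orbit relation $T^2 = 4\pi^2(R + h)^3/(gR^2)$; none of these inputs are taken from the text, and the function names are hypothetical.

```python
# Circular orbit at the Earth's surface and the altitude of a synchronous orbit.
import math

G_SURFACE = 9.8          # m/s^2, assumed
R_EARTH = 6.37e6         # m, assumed
SIDEREAL_DAY = 86164.0   # s, assumed

def surface_orbit_speed(g=G_SURFACE, radius=R_EARTH):
    """Speed of a circular orbit skimming the surface, v = sqrt(g R)."""
    return math.sqrt(g * radius)

def surface_orbit_period(g=G_SURFACE, radius=R_EARTH):
    """Period of that orbit, T = 2 pi R / v."""
    return 2.0 * math.pi * radius / surface_orbit_speed(g, radius)

def synchronous_altitude(period=SIDEREAL_DAY, g=G_SURFACE, radius=R_EARTH):
    """Solve T^2 = 4 pi^2 (R + h)^3 / (g R^2) for the altitude h."""
    orbit_radius = (g * radius ** 2 * period ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)
    return orbit_radius - radius

v_kms = surface_orbit_speed() / 1000.0
t_min = surface_orbit_period() / 60.0
h_km = synchronous_altitude() / 1000.0
print(v_kms, t_min, h_km)
```

Small differences from the quoted values are expected, since the book's exact inputs for $g$ and the Earth's radius are not stated here.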

es ~_ @E" ) Sia 
(2.9) (r-a’ =a, mr= a = 


Section 4-3. 


(3.5) $v = (2gR_e)^{1/2} = 11.2$ km/s = 25 000 mph.


DG Miah 4 
: =|——="* | -v, = 2.8 km/s = 1.2 X 10° ft/s; 
(3.6) Us ee Up m/s S 
= Tle ae -= 250d 
"(GM an 
(3.9) 6 = 30°. 
(3.10) h=R,e(1-cos +f)/(1 — &); sin? a, = (2 -— vi/gR,)" provided 
pated ee a 
(3.11) For f =e, Lv = mév’a = k(e+1)é, and Exercise (3.8) gives 
Age = — 2aav’é; é/2a(€ + 1) = 24 revolutions 


Section 4-5. 


(5.1) 7? > 0 for all values of r implies 
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r’°V(r) < fle + Er?———> - as 
2m ro0 2m : 
(2) In each case the integral eu be simplified by changing to the 
variable u = r'ortov=r° 
(5.3) Investigate derivatives of the effective potential higher than second 
order. 


Section 4-7. 


Ho sin @ 
(7.4) aes Ve ea 9p cos(@ + @). 
(7.5) (a) E, = m,Q/m. 

(b) K, = m‘(m,K + m,Q + 2Vm,m,QK). 
(7.6) (a) At, = 16 yr, v = 16 km/s, cos a = 1/2. 

(b) Av = 11.7 km/s, 6 = 109°. 

(c) d=0.9R, 

(djve = S554, = 37 ye Ar, = 1.7 ve. 


Section 5-1. 


(iat) XAY = QXAz, So f(xay) = af(xaz) = af(x)af(z). 
(1.2) flax,n .. . AX) = flax, )Af(X,)A. . . ASX) 
= af(x)a... af(%). 


vf(x) vf(y)| _ | f(v)-x f(v)-y 
uf(x) vf(y) f(u)x  f(u)y 


(1.4) (a) f(x) = x;f(o,) =0 only if all x, vanish provided f(e,a 
o,A0,) = 0. . 
(Dix =7 (y)- 

(5) ef (i) = gf(i) = (det g)f (i) = (det g)(det f)i. 

(1.6) Solution from Exercise (2-1.3) 


(1.3) 


= x __x-ba 
a@  a(at+a-b) — 
(7) Solution from Exercise (2-1.4) 


ee arx + ax'B - B xaB 
: ata — B’) 


(1.8) Use Equations (2-1.16) and (2-1.18). 
(L10) (q@ay+. 2. + aa ja(ain... ap. J. Aa,) =Car, - 


= (a + BY" (x + a@'xaB). 
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(1.12) 


Re = CAA. . Ay. eae 
B,A, = B,A, = A,B,,, hence 


aA eG) Na ee aA eee. «Aa 


fk = a;(f'a,) = oaflo,i)/fl) = =(-1)"*'9,a fo, ee eee 
nie, . . H,): 


Section 5-2. 


(2.1) 


(2.3b) 
(2.5) 


(2.6b) 
(207) 


fx = ax + bax + A-x, 

f,.x = ax + +(ab-x + ba-x), 

fx = x-(A + baa). 

ArlOp SV Stay V2 oat oe + V2 a, 

A, = 3, e, = +Ce, + 20, + 2¢,); 

A, = 6,e, = 3 (2¢, ~¢, + 2a,), 

A, = 9, e, = + (20, + 20, -4,). 

A, =a’,4,=6°,14,=-c”’. 

A quadric surface centered a the point a, as described in Exercise 
(2.6) with S = ff. No solution if all eigenvalues of <5 are negative. 


Section 5-3. 


(3.1) 
(3.2) 
(3.3) 
(Cs) 
(3.6) 
(3.7) 


(3.12) 
(3.13) 


(3.15) 


R=". 

Ut =U. 

(-1)° ¢,¢,0,x0,0,0, = — ifxi = —x. 

Rx’ = xR; hence (1 + iy)x’ = x(1 + éy). 

Use the relations RRt = o° + B° = 1 and e*® = R* = ao - B + 2iaf. 
From Exercise (3.6) 

Cj, = Oj, + 1AAG)AG, sina + (AAS): (AAG,)(1 — cos a). 


(a) cosa -sina 0 ; 
| sina cosa QO 
0 0 1 


Tr & = DXe,R's,R), = 4R2- 1 = 4 cos? +a-1 = 1+ 2 cosa. 
Consider the products of a symmetric operator with reflections by 
its eigenvectors. 

Ox = Ax:A: R = e4*. 
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(3.16) Eigenvector o, = fo,; Principal values A, = (1+a°)'’? + a, 
Principle vectors e, = a, + A,o,; 


a Oh. 6,9, 


2 =i =f A 
Re = Ke" (fe,) =e 


— ea no 


whence tan 0 = — a. 


Section 5-4. 


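(4.2) $\{S|b\}\{R|a\}x = \{S|b\}(R^\dagger xR + a) = S^\dagger(R^\dagger xR + a)S + b$
$= (RS)^\dagger x(RS) + S^\dagger aS + b = \{RS \,|\, S^\dagger aS + b\}x$.
(4.6) $c = a - b + R^\dagger bR$.
(4.7) $\{R|a\}^{-1}\{1|b\}\{R|a\} = \{1 \,|\, RbR^\dagger\}$
(4.8) $S_a x = -a^{-1}(x - a)a + a = -a^{-1}xa + 2a$.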
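
The composition rule in (4.2) and the conjugation in (4.7) can be illustrated in the plane. This is only a sketch under a simplifying assumption: a rotation $x \to R^\dagger xR$ is modeled as multiplication of a complex number by a unit complex number, so it does not use spinors at all; the class name `Displacement` is hypothetical.

```python
# Plane rigid displacements x -> rot*x + shift, with rot a unit complex number.
import cmath

class Displacement:
    def __init__(self, rot, shift):
        self.rot = rot        # unit complex number standing in for the rotation
        self.shift = shift    # complex number standing in for the translation

    def __call__(self, x):
        return self.rot * x + self.shift

    def after(self, first):
        """Displacement equal to applying `first` and then `self`."""
        return Displacement(self.rot * first.rot, self.rot * first.shift + self.shift)

    def inverse(self):
        return Displacement(1.0 / self.rot, -self.shift / self.rot)

R = Displacement(cmath.exp(0.7j), 1.0 + 2.0j)
S = Displacement(cmath.exp(-1.1j), -3.0 + 0.5j)
x = 0.4 - 2.2j

composed = S.after(R)                      # analogue of {S|b}{R|a}
b = 2.0 - 1.0j
conjugated = R.inverse().after(Displacement(1.0, b)).after(R)
print(composed(x), S(R(x)), conjugated.rot, conjugated.shift)
```

The composed displacement has the rotated translation of the first map plus the translation of the second, and the conjugated translation is again a pure translation by a rotated vector, mirroring the structure of (4.2) and (4.7).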


Section 5-5. 


(5.4) Using Equations (3.42a, b, c), 


2R = pio,R + R,O,io,6R, + Rio, 
= i(6,y + R,Q,6,Q)R10 + Ro,R'p)R 
v v v 
(327) (a) @, eae b,@ = Gp (2 »)- 


(b) x=r+a=— (a+b) xXrt+aXb, 
ab 


2 


f= spr 8X (a Xt) + bX (bX x) + 2b X (FX a), 
(ee 
a 
(c) Atr=b, c= Pa xb, =v Cae b.); 
a 


Atr=-b, *=0,%=v'(a'+b°). 


Section 5-6. 


2 

lgao|r-@ wr 

OS tae ee COS Or 
§ 


Qax = 0°6’ at 0 = 45°. 


(6.1) 
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ee eC 


(6.2) 


(6.4) 


(a) $\Delta r = \tfrac{1}{3} t^3\, g \times \omega = (1.55\ \text{cm})$ East.

(b) In an inertial frame, the easterly velocity of the ball when it is
released is greater than that of the Earth’s surface.

(ce) (5 « 10 cin south: 

r,w/r.W, = 0.2%. 


Section 6-1. 


(1.1) 
(1.3) 
(1.6) 


Consider the universe. 
Energy dissipated = mga/6. 
(a) v= v,/(1 + m/M). 
(O)'y = yer". 


Section 6-2. 


(2.1) 


(2.3) 


(2.5) 


m cos? @ ) 
a 9 


a een ae 
oe m+M 


“ : m+M a 
X =—gsinacosa - cos? a | 


(m, + m,)¥ + m,(ld cos o — 1p? sin #) = 0, 
lp + ¥ cos d + mg sin d = 0. 


” ena) der 
Xi) a Soe mei 


Section 6-3. 


G2) 


$\omega_r = 2(\kappa/m)^{1/2} \sin(r\pi/8)$ for r = 1, 2, 3.
q(t) = (1/8)'*A(cos w,t — cos w,t) = q,(t), 
q(t) = +A(cos w,t — cos w,f). 
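
The frequencies $\omega_r = 2(\kappa/m)^{1/2}\sin(r\pi/8)$ can be checked against a secular determinant. The sketch assumes the system is three equal masses joined in a line by four equal springs with fixed ends, which is an inference from the form of the answer rather than something stated here; the helper names are hypothetical.

```python
# Check that omega_r^2 (in units of kappa/m) are roots of det(K - lam*I) = 0,
# where K is the assumed stiffness matrix of a three-mass, four-spring chain.
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def secular_determinant(lam):
    """det(K - lam*I) with K = [[2,-1,0],[-1,2,-1],[0,-1,2]]."""
    d = 2.0 - lam
    return det3([[d, -1.0, 0.0], [-1.0, d, -1.0], [0.0, -1.0, d]])

def mode_frequency(r, kappa=1.0, m=1.0):
    """Normal frequency omega_r = 2 sqrt(kappa/m) sin(r pi / 8)."""
    return 2.0 * math.sqrt(kappa / m) * math.sin(r * math.pi / 8.0)

residuals = [secular_determinant(mode_frequency(r) ** 2) for r in (1, 2, 3)]
print(residuals)
```

All three residuals should vanish to rounding error if the assumed chain matches the exercise.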


Section 6-4. 


(4.1) 


(4.3) 


w, = 2w, = (g/l)’?, with unnormalized | a,) = [3]. | 4) = uae 
O, = a,(3¢, -— 2¢,), Q, = a,(3¢, + ,), where a, and a, are nor- 
malization factors. 

a, =o, = (1 + ky?, wo, = (1 - 2k). 


Section 7-2. 


(2.1) 


aor 


Ds 
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1 il 1 2a 
4a 2a 
ee 3x 7 
2 
(2.4) (pee + 4a’ Ja 
(2.5) jonty from the center of the sphere. 


(2.6) 2 a(1 + cos 9), a’(1 — cos ¢). 


2 
(2.9) Ju= = E u — (a, + a, + a,)(a, + a, + a,)-u 
2 | eae 
3 4 4 
(a; %a,) ot ma- =i Ps =) 
4 3 4 
ad e 
4 4 3 
b*¢ + Ca’ + a’b? 
2.10 (oe ees 
ey) 6 @?+b?+e¢ 
(2.11) fu =72u 


(2.12) = (a X )? 
(213) €, =, & =r # F,. € = 6, —-16; 


Ju = a'[é,é,au + 96,6,au + €,é,Au]. 


(2.23) | Use the method at the end of Section 5-2. 


(2.24) r= 22 (8 + V3), 6 = 4.7° 


Section 7-6. 


(6.4) tang@ =1-24,0= a 




Appendix A 
(A.2) <a = ~iaA = -jaei* = giala- 72) 


(A.7) Area = 2|aab| + 2|bac| + 2|caal 
= 2ab sin C + 2bc sin A + 2ca sin A. 


From Exercise (A.6), 
Volume = |aabac| 
= abc[1 — cos’ A — cos” B — cos? C + 2 cos A cos B cos C]'”. 


(A.9) 4m = 2A + 2(2a— A) + 2(2b ~ A) + 2(2c — A). 
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There are many fine textbooks on classical mechanics, but only a couple are mentioned below as 
supplements to the present text. Most students spend too much time studying textbooks. They 
should begin to familiarize themselves with the wider scientific literature as soon as possible. The 
sooner a student penetrates the specialist literature on topics that interest him, the more rapidly 
he will approach the research frontier. He should not be afraid to tackle advanced monographs, 
for he will find that they often contain more licid treatments of the basics than introductory 
texts,and the difficult parts will alert him to specifics in his background that need to be filled in. 
Rather than read aimlessly in a broad field, he should focus on specific topics, search out the 
relevant literature, and determine what is required to master them. Above all, he should learn to 
see the scientific literature as a vast lode of exciting ideas which he can mine at will by himself. 

Most of the references below are intended as entries to the literature on offshoots and 
applications of mechanics. Many are classics in their fields, and some are advanced monographs, 
but all of them will yield rich rewards to the dedicated student. This is just a sampling of the 
literature with no attempt at completeness on any topic. 


Supplementary Texts 


The books by French and Feynman are notable for their rich physical insight communicated with 
a minimum of mathematics. Although both are introductory textbooks, they can be read with 
profit by advanced students. French’s book is especially valuable for its historical information,
which should be part of every physicist’s education.

A. P. French, Newtonian Mechanics, Norton, N.Y. (1971). 

R. P. Feynman, Lectures on Physics, Vol. 1, Addison-Wesley, Reading (1963). 


References on Geometric Algebra 


The present book (NFI) is the only available introduction to geometric algebra and its appli- 
cations to mechanics. A sequel (NFII) in preparation will considerably broaden the range of 
applications. 

The only other published books on geometric algebra are advanced monographs, Space-Time
Algebra and Geometric Calculus. The first deals tersely with applications to relativity. The second 
is devoted exclusively to mathematical developments of the calculus. It is advisable to become 
thoroughly familiar with NFI before addressing either of these books. 

D. Hestenes and G. Sobczyk, Clifford Algebra to Geometric Calculus, a Unified Language for
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Mathematics and Physics, D. Reidel, Dordrecht (1984). [Referred to as Geometric Calculus 
in the text] 

D. Hestenes, Space-Time Algebra, Gordon and Breach, N.Y. (1966).

D. Hestenes, New Foundations for Mathematical Physics, D. Reidel, Dordrecht (estimated 
publication date: Spring 1988). [Referred to as NFII in the text] 


Chapter 1 


A satisfactory history of geometric algebra has not yet been written. But Kline traces the
interplay between geometry and algebra, mathematics and physics in their historical development.
The scholarly work by van der Waerden shows clearly the common historical origins of
geometry and algebra. Clifford’s book is one of the best popular expositions ever written on the
role of mathematics in science.

M. Kline, Mathematical Thought from Ancient to Modern Times, Oxford U. Press, N.Y.
(1972).
B. L. van der Waerden, Science Awakening, Wiley, N.Y. (1963).
W.K. Clifford, Common Sense of the Exact Sciences (1978), reprinted by Dover, N.Y. (1946). 


Section 2-6 


Zwikker gives an extensive treatment of plane analytic geometry, using complex numbers in a 
manner closely related to the techniques of Geometric Algebra. 
C. Zwikker, The Advanced Geometry of Plane Curves and Their Applications, Dover, N.Y. 
(1963). 


Section 3-1. 


This magnificently edited and annotated collection of Newton’s papers provides valuable insight 
into Newton’s genius. One can see, for instance, the extensive mathematical preparation in 
analytic geometry that preceded his great work in mechanics. 
D. T. Whiteside (ed.), The Mathematical Papers of Isaac Newton, Cambridge U. Press, 
Cambridge (1967-81), 8 Vol. 


Section 3-5. 


The forces of fluids on moving objects are extensively analyzed theoretically and empirically in 
Batchelor’s classic. 
G. K. Batchelor, An Introduction to Fluid Dynamics, Cambridge U. Press, N.Y. (1967). 


Section 3-6 and 3-7. 


Feynman gives a good introduction to electromagnetic fields and forces. 
R. P. Feynman, The Feynman Lectures on Physics, Vol. II, Addison-Wesley, Reading (1964).
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Section 6-3. 


Brillouin’s classic is an object lesson in how much can be accomplished with a minimum of 
mathematics. He discusses electrical-mechanical analogies as well as waves in crystals. 
L. Brillouin, Wave Propagation in Periodic Structures, McGraw-Hill, N.Y. 1946 (Dover, N.Y. 
1953). 
Physics students will do well to sample the vast engineering and applied mathematics literature 
on linear systems theory. 


Section 6-4. 


Herzberg is still one of the most important references on molecular vibrations. Califano gives a 
more up-to-date treatment of group theoretic methods to account for molecular symmetry. 
Further improvement in these methods may be expected from employment of geometric algebra. 
G. Herzberg, Molecular Spectra and Molecular Structure, I. Infrared and Raman Spectra of 
Polyatomic Molecules, D. Van Nostrand Co., London (1945).
S. Califano, Vibrational States, Wiley, N.Y. (1976). 


Section 6-5. 


The most extensive survey of work on the restricted three body problem is, 
V. Szebehely, Theory of Orbits, Academic Press, N.Y. (1967). 


Section 7-4. 


This is one of the standard advanced references on the theory of spinning bodies, as well as the
three body problem. 
E. T. Whittaker, A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, 
Cambridge, 4th Ed. (1937). 


Chapter 8 


Stacey and Kaula present fine introductions to the rich field of geophysics and its generalization 
to planetary physics, showing connections to celestial mechanics. The book by Munk and 
MacDonald is a classic on the Earth’s rotation.

Roy gives an up-to-date introduction to celestial mechanics and astromechanics combined. 
Kaplan gives a more complete treatment of spacecraft physics. 

F. D. Stacey, Physics of the Earth, Wiley, N.Y. (1969). 

W. M. Kaula, An Introduction to Planetary Physics, Wiley, N.Y. (1968). 

W. Munk and G. MacDonald, The Rotation of the Earth, Cambridge U. Press, London (1960). 

A. E. Roy, Orbital Motion, Adam Hilger, Bristol, 2nd Ed. (1982). 

M. H. Kaplan, Modern Spacecraft Dynamics and Control, Wiley, N.Y. (1976). 
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Section 8-4. 


Stiefel and Scheifele is the sole reference on applications of the KS equation. The book provides
many important insights into computational theory and technique. 
E. L. Stiefel and G. Scheifele, Linear and Regular Celestial Mechanics, Springer-Verlag, N.Y. 
(1971). 


Chapter 9. References on the Philosophy of Science 


The philosophy of science is in a disorderly state. Contradictory viewpoints and ill-considered 
opinions abound in the scientific and philosophical literature. An assessment of the situation in 
accord with the viewpoint in this book has been made by philosopher-physicist Mario Bunge: 

M. Bunge (1973), Philosophy of Physics, Dordrecht: D. Reidel Publ. Co. 

In his mature years, Bunge has undertaken a systematic formulation of the philosophy of 
science from the viewpoint of scientific realism and systems theory. His results will appear in an 
ambitious seven volume treatise of which several volumes have been published so far: 

M. Bunge (1974—_ ). Treatise on Basic Philosophy, 7 vol., Dordrecht: D. Reidel Publ. Co. 
This sophisticated treatise is not likely to appeal to readers with only a superficial interest in 
philosophy. However, Bunge’s systematic and careful analysis of basic concepts was invaluable in 
the preparation of the Chapter 9 here. 

Supplementing our discussion in Chapter 9, Rosen discusses the fundamental role of symmetry 
principles in the foundations of scientific theory. 

J. Rosen (1983), A Symmetry Primer for Scientists, Wiley, N.Y. 

As a worthy sample of writings on the foundations of physics, we have the following books by 
eminent mathematicians as well as physicists and philosophers: But beware of conflicting 
viewpoints and tacit assumptions! 

Campbell (1957). Foundations of Science, New York: Dover Publ., Inc., (formerly, Physics: 

The Elements, Cambridge, 1920). 

H. Jeffreys (1973). Scientific Inference, Cambridge U. Press, Cambridge.

J. C. Maxwell (1977). Matter and Motion, New York: Dover Publ., Inc. 

E. Nagel (1961). The Structure of Science, Harcourt, Brace & World, New York. 

R. Nevanlinna (1964). Space Time and Relativity, Addison-Wesley, New York.

E. T. Whittaker (1949). From Euclid to Eddington: A Study of Conceptions of the External 

World, Cambridge U. Press, Cambridge. 


Appendix B 


Jahnke and Emde is a standard reference on elliptic functions and elliptic integrals. 
E. Jahnke and F. Emde, Tables of Functions, Dover, N.Y. (1945). 
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acceleration, 98, 312 ballistic trajectory, 215 
centripetal, 312 barycentric coordinates, 82 
Coriolis, 312 basis, 53 

ambient velocity, 146 of a linear space, 53, 363 

amplitude of an oscillation, 168 multivector, 53 

analyticity principle, 122 vectorial, 49, 260 

angle, 66 beats, 365 
radian measure of, 219 billiards, 498, 503 

angular momentum, 195ff bivector (2-vector), 21 
base point, 423 basis of, 56 
bivector, 196 codirectional, 24 
change of, 423 interpretations of, 49 
conservation, 196, 338 blade, 34 
induced, 330 Brillouin zone, 374 
internal, 337 
intrinsic, 424 Cayley-Klein parameters, 480, 485, 495 
orbital, 337 celestial mechanics, 512ff 
total, 337 celestial pole, 458, 538 
vector, 196 celestial sphere, 466, 538 

Angular Momentum Theorem, 338 center of gravity, 433 

anharmonic oscillator, 165 center of mass, 230, 336 

anomaly, additivity principles for, 437 
eccentric, 532 of continuous body, 434 
mean, 532 symmetry principles for, 435ff 
true, 532, 573 (see centroid)

apocenter, 213 center of mass theorem, 336 

apse (see turning point), centroid, 438 

area, chain rule, 100, 105, 108 
directed, 70 Chasles’ theorem, 305 
integral, 112ff, 196 Chandler wobble, 458 

associative rule, 27, 32, 35 characteristic equation, 166, 171, 383 

astromechanics, 512 chord, 79 

asymptotic region, 210, 236 circle, equations for, 87ff 

attitude, 420 Classical Field Theory, 514, 591 
element, 529, 549 Clifford, 59 
spinor, 420 clocks, 

Atwood’s machine, 354 particle, 585 

axode, 428 photon, 592 
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coefficient of restitution, 505, 511 


collision, 
elastic, 236, 505 
inelastic, 346, 505 
commutative rule, 15, 35 
commutator, 44 
configuration space, 351, 382 
congruence, 3, 303 
conicoid, 91 
conic (section), 90ff, 207 
constants of motion, 
for Lagrange problem, 476 
for rigid motion, 425 
for three body problems, 399 
constraint, 181, 599 
bilateral, 188 
for rolling contact, 492 
for slipping, 497 
holonomic, 185, 351, 354 
unilateral, 188 
continuity, 97 
coordinates, 
complex, 371 
ecliptic, 238ff 
equatorial, 238ff
generalized, 350, 381 
ignorable (cyclic), 358 
Jacobi, 406 
mass-weighted, 385 


normal (characteristic), 362, 364 


polar, 132, 194 

rectangular, 132 

symmetry, 388 
correspondence rules, 579, 581 
couple, 430 
Cramer’s rule, 254 
cross product, 60 


degrees of freedom, 351 
derivative, 
by a vector, 117 
convective, 109 
directional, 105, 107 
of a spinor, 307 
partial, 108 
scalar, 98 
total, 109 
Descartes, 5 
descriptors, 577 
definition, 
explicit, 580 
implicit, 580 


determinant, 62, 255 
of a frame, 261 
of a linear operator, 255, 260 
of a matrix, 258, 260 
differential, 107 
exact, 116 
differential equation, 125 
dihedral angle, 604 
dilation, 13, 52 
dimension, 34, 54 
Diophantus, 9
directance, 82, 87, 93, 427 
direction, 11 
of a line, 48 
of a plane, 49 
dispersion relation, 373 
displacement, 
rigid, 303, 305 
screw, 305 
distance, 79 
distributive rule, 18, 25, 31, 35 
drag, 146 
atmospheric, 215, 563 
pressure, 149 
viscous, 149 
(see force law) 
drag coefficient Cp, 147 
drift velocity, 159 
dual, 56, 63 
dynamical equations, 454, 578 
(see equations of motion), 
dynamics, 198 


eccentricity, 90, 205 
eccentricity vector, 91, 205, 527 
ecliptic, 466, 539 
eigenvalue problem, 264ff 
brute force method, 384 
eigenvalues, 264ff 
degenerate, 266 
eigenvectors, 264ff, 272 
Einstein, 574 
elastic modulus, 374 
elastic solid, 360 
electromagnetic wave, 174 
ellipsoid, 276 
ellipse, 91, 96, 173, 174, 199, 203, 208 
semi-major axis of, 212 
elliptic functions, 222, 478, 481, 610ff 
modulus of, 610 
elliptic integral, 482, 490, 545, 547, B-611ff 
Ecliptic , 466 
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energy, 
conservation, 170 
Coriolis, 342 
diagrams, 223, 229 
dissipation, 177, 241 
ellipsoid, 487 
internal, 342ff 
kinetic, 182, 337 
internal, 341 
rotational, 338 
translational, 337 
potential, 182 
storage, 177, 364 
total, 182, 206, 528 
transfer, 238, 344, 364 
vibrational, 342 
epicycle, 201 
epitrochoid, 201, 204 
equality, 12, 37 
equations of motion, 125 


rotational (see spinor equations), 340, 


420 
secular, 531 
for orbital elements, 531 
translational, 335, 420 
equiangular spiral, 155 
equilibrium, 379 
mechanical, 429 
point, 409 
equimomental rigid bodies, 448 
equinoxes, 539 
equipotential surface, 116, 185 
escape velocity, 214 
Euclid, 29 
Euclidean spaces 
2-dimensional, 54 
3-dimensional, 54 
n-dimensional, 80 
Euler, 121 
Euler angles, 289, 294, 486, 490, 538 
Euler’s Law (equation), 340, 420, 454 
components of, 422ff 
Euler parameters, 382, 315 
exponential function, 66, 73ff, 281 


factorization, 45
Faraday effect, 179
field, 104
first law of thermodynamics, 344
fluid resistance, 146ff
force, 121
binding, 164
body, 125
centrifugal, 318, 332
conservative, 181, 219
contact, 125
Coriolis, 319, 322, 324, 328
fictitious (see force law), 317
generalized, 353
impulsive, 214, 501
perturbing, 143, 165, 527
superposition, 122
tidal, 520
force constants, 380
force field, 184
central, 219
conservative, 184, 219
force law, 122
conservative, 181ff
constant, 126
Coulomb, 205,
with cutoff, 251
electromagnetic, 123, 155
frictional, 192, 471
gravitational, 123, 200, 205, 513
Hooke’s, 122, 361, 364
inverse square, 200
magnetic, 151
phenomenological, 195
resistive (see drag), 146
linear, 134, 154
quadratic, 140
forces on a rigid body,
concurrent, 433
equipollent, 428
parallel, 429
reduction of, 428
frame (see basis), 261
body, 339
Kepler, 529
reciprocal, 262
frequency,
cutoff, 371
cyclotron, 154
Larmor, 328
normal (characteristics), 362
degenerate, 362
oscillator, 168
resonant, 177
multiple, 397


Galileo, 575 
geoid, 524, 526 
geometric algebra, 53, 55, 80 
geometric product, 31, 39 
geometry, 
analytic, 78ff 
coordinate, 78 
Euclidean, 79 
non-Euclidean, 79 
geopotential, 525 
Gibbs, 60 
golden ratio, 226 
grade, 22, 30, 34 
gradient, 116 
Grassmann, 12, 14, 28
gravitational field, 513 
force exerted by, 513, 520 
of an axisymmetric body, 518 
of an extended object, 515ff 
source, 513 
superposition, 514 
gravitational potential, 514 
harmonic (multipole) expansion of, 517 
of a spherically symmetric body, 516 
gravitational quadrupole tensor, 517, 542 
gravity assist, 239, 242 
group, 
abstract, 296 
continuous, 298 
dirotation, 296 
Euclidean, 301ff
Galilean, 313 
orthogonal, 299 
representation, 297 
rotation, 296ff 
subgroup of, 299, 306 
transformation, 295 
translation, 300ff 
guiding center, 158 
gyroscope, 454 
gyroscopic stiffness, 455ff 


Hall effect, 160 
Halley’s comet, 214 
Hamilton, 59, 286 
Hamilton’s theorem, 295 
harmonic approximation, 380 
harmonic oscillator, 165 
anisotropic, 168 
coupled, 361 
damped, 170 
forced, 174 
in a uniform field, 173, 202, 325 
isotropic, 165 
heat transfer, 345
helix, 154
Hill’s regions, 416 
hodograph, 127, 204 
Hooke’s law, 122, 166 
hyperbola, 91, 96, 208 
branches, 213 
hyperbolic functions, 74 
hypothesis, 580 
hypotrochoid, 202, 204 


idempotent, 38
impact parameter, 211, 245
impulse, 501
impulsive motion, 501
inertia tensor, 253, 339, 421, 439
additivity principles for, 442
calculation of, 439
canonical form for, 451
derivative of, 340
matrix elements of, 445
of a plane lamina, 274
principal axes of, 422
principal values of, 422
symmetries of, 448
initial conditions, 125
initial value problem, 208
integrating factor, 134, 139, 152, 173
interaction, 121 
gravitational, 513 
inner product, 16ff, 33, 36, 39 
of blades, 43 
inverse, 35, 37 
inversion, 293, 437 


Jacobi identity, 47, 83 
Jacobi’s integral, 408 


Kepler, 200 
Kepler motion, 527 
Kepler problem, 204 
2-body effects on, 233 
Kepler’s equation, 216ff, 533 
Kepler’s Laws, 
first, 198 
second, 196, 198 
third, 197, 198, 200, 203 
modification of, 233 
kinematical equation, 
for rotational motion, 454 
kinematics, 198 
KS equation, 569 
Lagrange points, 409
Lagrangian, 353
Lagrange’s equation, 190, 353, 380
Lagrange’s method, 354
Lamé’s equation, 491
Laplace expansion, 43, 261
Laplace vector (see eccentricity vector),
Larmor’s theorem, 328
lattice constant, 367
law of composition,
generic, 586
specific, 587
law of cosines, 19, 69
spherical, 523, 606, 607
law of sines, 26, 70
spherical, 607
law of tangents, 294
lemniscate, 204
lever, law of, 430
line,
equations for, 48, 81ff
moment of, 82
line integral, 109ff, 115
line vector, 428
linear algebra, 254
linear dependence, 47
linear functions, 107, 252ff
(see linear operators)
linear independence, 53
linear operators, 253ff
adjoint (transpose), 254
canonical forms, 263, 270, 282
derivative of, 316
determinant of, 255, 260
inverse, 260
matrix element, 262
matrix element, 257
matrix representation of, 257
nonsingular, 256
orthogonal, 277
improper, 278
proper, 278
polar decomposition, 291
product, 253
secular equation for, 265
complex roots, 268
degenerate, 266
shear, 295
skewsymmetric, 263
symmetric (self-adjoint), 263, 269ff
spectral form, 270
square root, 271
trace, 295
linear space, 53
dimension of, 54
linear transformation (see linear operator),
Lissajous figure, 169
local interaction principle, 588, 591, 593
logarithms, 75ff


Lorentz electron theory, 179 
Lorentz force, 123 


Mach number, 149 
magnetic spin resonance, 473ff 
magnetron, 202 
magnitude, 3, 6 
of a bivector, 24 
of a multivector, 46 
of a vector, 12 
many body problem, 398 
constants of motion, 399 
mass, 230 
density, 434 
reduced, 230 
total, 336, 434 
matrix, 257 
determinant of, 259, 260 
equation, 258 
identity, 258 
product, 258 
sum, 258 
mean motion, 532 
measurement, 2, 581, 600 
model, 378, 577ff 
abstract, 599 
concrete, 599 
deployment, 596, 600 
development, 596 
process, 598 
ramified, 600 
modeling, 596ff 
stages, 596 
modulus, 
of a complex number, 51 
of an elliptic function, 610 
of a multivector, 46 
Mohr’s algorithm, 273 
moment arm, 428 
momentum, 236 
conservation, 236, 336, 591 
flux, 347 
transfer, 238, 240, 591 
motion, 121 
in rotating systems, 317ff 
rigid, 306ff
translational, 335 
(see rotational motion, periodic 
motion) 
multivector, 34 
even, 41 
homogeneous, 41, 42 
k-vector part of, 34, 39 
odd, 41 
reverse, 45 


natural frequency, 168 
Newton, 1, 120, 124, 574 
Newton’s Law of Gravitation, 398
universality of (see force law), 201, 203
Newton’s Laws of Motion,
first, 588
second, 41, 588, 590
third, 588
strong form, 335
weak form, 335, 591
nodes,
ascending, 538
line of, 290
precession of, 540
normal (to a surface), 116
normal modes, 362
degenerate, 383
expansion, 364
nondegenerate, 383
normalization, 377
orthogonality, 369
wave form, 369
number, 3, 5
complex, 57
directed, 11, 12, 34
imaginary, 51
real, 10, 11, 12
nutation,
luni-solar, 551
of a Kepler orbit, 540
of a top, 470
of Moon’s orbit, 550


oblateness, 
constant J₂, 518
of Earth, 459, 467
perturbation, 542, 560
Ohm’s law, 137
orbit, 121, 585
orbital averages, 253ff
orbital elements, 527
Eulerian, 538
secular equations for, 531
orbital transfer, 214
orientation, 16, 23, 51
origin, 79
operators (see linear operators), 50
operational definition, 581
oscillations,
damped, 393
forced, 395
free, 382
phase of, 168
small (see vibrations), 378
osculating orbit, 528
outer product, 20, 23, 36, 39
of blades, 43
outermorphism, 255 


parabola, 91, 96, 126, 207 
Parallel Axis Theorem, 424 
parenthesis, 42 
preference convention, 42 
for sets, 48 
particle, 121 
unstable, 242 
pendulum, 
compound, 463, 477, 489 
conical, 467 
double, 355, 386
damped, 395 
Foucault, 223 
gyroscopic, 462 
simple, 191, 463 
small oscillations of, 462
spherical, 475 
pericenter, 91, 213 
perigee, 213 
perihelion, 213 
period, 
of an oscillator, 168 
of central force motion, 221 
of the Moon, 203 
periodic motion, 168, 478 
perturbation, 
oblateness, 542, 560 
theory, 141, 320 
gravitational, 527, 541 
third body, 541 
physical space, 80, 583 
plane, 
equations for, 86ff 
Poincaré, 399, 416 
Poinsot’s construction, 487
point of division, 84, 430 
polygonal approximation, 143 
position, 80, 121, 420, 584ff 
spinor, 564 
position space, 314, 584 
potential, 116 
attractive, 229 
barrier, 228 
central, 220 
centrifugal, 224 
effective, 220, 408, 414 
gravitational, 514, 516 
screened Coulomb, 224 
secular, 545
Yukawa, 224
precession, 222 
luni-solar, 552 
of Mercury’s perihelion, 542ff 
of pericenter, 530 
of the equinoxes, 468, 553 
relativistic, 
General, 559 
Special, 562 
precession of a rigid body, 
Eulerian free, 456, 467, 475 
steady, 463, 483 
deviations from, 467 
principal moments of inertia, 446
principal values, 269, 292
principal vectors, 269, 292
of inertia tensor, 442 
projectile, 
Coriolis deflection, 321ff
range, 127, 136 
terminal velocity, 135 
time of flight, 130, 132 
projection, 16, 65, 270, 603 
properties, 
emergent, 582, 600 
physical, 576 
primary, 576 
qualitative, 578 
quantitative, 578 
secondary, 576 
pseudoscalar, 
dextral (right handed), 55 
of a plane, 49, 53 
of 3-space, 54, 57 


quantity, 34 
quaternion, 58, 62 
theory of rotations, 286 


radius of gyration, 447, 463 
reference class, 
of a model, 577 
of a property, 578 
of a theory, 581 
reference frame (body), 314, 584 
reference system, 317 
geocentric, 317 
heliocentric, 317 
inertial, 311, 586, 588, 589, 593 
topocentric, 317 
motion of, 327 
reflection, 278ff 
law of, 280 
rejection, 65 
regularization, 571 
Relativity,
General Theory, 514, 542, 557, 574, 583
Principle, 589, 593
Special Theory, 562, 583
relaxation time, 135 
resonance, 176 
cyclotron, 162 
electromagnetic, 175 
magnetic, 331, 473ff 
multiple, 396 
spin-orbit, 554 
reversion, 45 
Reynolds Number, 147 
rigid body classification, 448 
asymmetric, 448 
axially symmetric, 448, 454 
centrosymmetric, 448 
rocket propulsion, 348 
Rodrigues’ formula, 293 
rolling motion, 492ff 
rotation, 50, 278, 280ff 
axis, 304 
canonical form, 282, 288 
composition, 283 
group, 295ff 
matrix representation, 296 
spin representation, 296
matrix elements, 286, with Euler angles, 294
oriented, 283
parametric form, 282
physical, 297
right hand rule, 282
spinor theory of, 286
rotational motion, 317
integrable cases (see spinning top), 476
of a particle system, 338
of asymmetric body, 482
of the Earth, 327, 551
stability of, 488 


satellite, 
orbital precession, 544 
perturbation of, 547 
synchronous orbit, 203 
scalar, 12 
scalar integration, 100 
scalar multiplication, 12, 24, 31, 35 
scattering, 
angle, 210, 245 
in CM system, 237, 242 
in LAB system, 239, 242 
Coulomb, 247, 250 
cross section, 243ff 
LAB and CM, 248ff 
Rutherford, 247 
elastic, 236 
for inverse square force, 210ff 
hard sphere, 246 
scientific explanation, 601 
scientific knowledge, 596 
scientific law, 579 
basic, 580 
derived, 580 
dynamical, 580 
generic, 580, 581 
interaction, 580, 588 
physical, 579 
scientific method, 595, 597 
scientific realism, 575 
scientific theory, 579 
semi-latus rectum, 91, 212 
sense (or orientation), 51 
sidereal day, 458
simultaneity, 583, 585, 592 
hypothesis of, 589, 592 
solar wind, 563 
solid angle, 244 
sphere, equations for, 87ff 
spherical excess, 609 
spinning top, 
fast, 466 
hanging (see precession, rotational 
motion), 466 
Lagrange problem for, 462, 479, 490 
rising, 473 
sleeping, 473 
slow, 466 
spherical, 460 
symmetrical, 454ff 
Eulerian motion of, 460 
reduction of, 459
spinor, 51, 52, 67 
derivative of, 307 
Eulerian form, 284 
improper, 300 
mechanics, 564 
parametrizations, 286 
unitary (unimodular), 280 
spinor equation of motion, 
for a spherical top, 461 
for a particle, 569 
stability, 165, 227, 380 
of circular orbits, 228 
of Lagrange points, 410ff 
of rotational motion, 488 
of satellite attitude, 553 
state function, 587 
state variables, 126 
Stokes’ Law, 147 
summation convention, 63 
super-ball, 507, 510 
superposition principle, 
for fields, 514 
for forces, 122, 588 
for vibrations, 363
symmetry of a body, 435ff, 441
system,
2-particle, 230ff
closed, 346
configuration of, 350
Earth-Moon, 234
harmonic, 382
isolated, 232, 336
many-particle, 334ff
open, 346
systems theory, 578
linear, 378 


Taylor expansion, 102, 107, 164
temperature, 346
tensor, 253
three body problem, 400
circular restricted, 407
periodic solutions, 416
classification of solutions, 404
collinear solutions, 402
restricted, 406
triangular solutions, 402
tidal friction, 522
tides, 522
time scale, 589
tippie-top, 476
torque, 338
base point, 424
gravitational, 520 
moment arm, 428 
translation, 300 
(see group) 
trivector, 26 
trochoid, 159, 217 
turning points, 213, 221, 227 
trigonometric functions, 74, 281 
trigonometry, 20, 68 
identities, 71, 294 
spherical, 603 


units, 614 


variables, 
descriptive, 577 
instantiation of, 585 
interaction, 334, 578 
kinematic, 421 
macroscopic, 345 
object, 421, 577 
position, 334, 584 
state, 420, 577
vector, 12
addition, 15
axial, 61
collinear (codirectional), 16, 64
identities, 62
negative, 15
orthogonal, 49, 64
orthonormal, 55
polar, 61
rectangular components, 49, 56
square, 35
units, 13
vector field, 184
vector space, 49, 53
velocity, 98
addition theorem, 314, 593
angular, 309
complex, 427
rotational with Euler angles, 308, 423, 315, 490
spinor, 564
translational, 309
velocity filter, 160
vibrations,
of HO, 392
lattice, 366
molecular, 341, 387ff
small, 341, 378
Vieta, 9 


wave, 
harmonic, 372 
polarized, 375 
standing, 372 
traveling, 372 
wavelength, 369 
wave number, 369 
weight, 
apparent, 318 
true, 318 
work, 183, 342ff 
microscopic, 345 
Work-Energy Theorem, 343 
wrench, 426 
reduction of, 431 
superposition principle, 428 


Zeeman effect, 332 
zero, 14 
Zeroth Law, 80, 580, 582ff, 586
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geometrical and physical intuition. Besides covering the standard material 
for a course on the mechanics of particles and rigid bodies, the book in- 
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mechanics, developing these subjects to a level well beyond that of other 
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and spinors. Thus, it increases the power of the mathematical language of 
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mechanics. This book systematically develops purely mathematical applications
of geometric algebra useful in physics, including extensive applications
to linear algebra and transformation groups. It contains sufficient material for 
a course on mathematical topics alone. 

Besides a reformulation of the mathematical foundations of mechanics, the 
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from the modern perspective of Modeling Theory. This should be of interest 
to readers concerned with the philosophy of science. 
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